Full Resource Library

The Future of Big Data

Big data is the catch-all term for gathering, analyzing, and using massive amounts of digital information to improve operations. It is rapidly changing the way we live and shop. Learn what big data is and how you can put it to work for you.

ETL vs ELT: Defining the Difference

The difference between ETL and ELT lies in where data is transformed into business intelligence and how much data is retained in working data warehouses. Discover what those differences mean for business intelligence, which approach is best for your organization, and why the cloud is changing everything.
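
The contrast can be sketched in a few lines of Python, using an in-memory SQLite database as a stand-in for a data warehouse. This is only an illustration under stated assumptions: the sample records, table names, and function names are hypothetical and do not come from any Talend product. In the ETL path the data is cleaned and aggregated before loading, so only the result reaches the warehouse. In the ELT path the raw rows are loaded unchanged and transformed inside the warehouse with SQL, so the raw data is retained there as well.

```python
import sqlite3

# Hypothetical raw order records, invented for illustration only.
RAW_ORDERS = [
    ("A-1", " alice ", "19.99"),
    ("A-2", "BOB", "5.00"),
    ("A-3", "alice", "30.01"),
]


def run_etl(conn):
    """ETL: transform in application code, then load only the result."""
    totals = {}
    for _order_id, customer, amount in RAW_ORDERS:
        name = customer.strip().lower()  # clean before loading
        totals[name] = totals.get(name, 0.0) + float(amount)
    conn.execute("CREATE TABLE etl_totals (customer TEXT, total REAL)")
    conn.executemany(
        "INSERT INTO etl_totals VALUES (?, ?)", sorted(totals.items())
    )


def run_elt(conn):
    """ELT: load raw rows unchanged, then transform inside the warehouse."""
    conn.execute(
        "CREATE TABLE raw_orders (order_id TEXT, customer TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", RAW_ORDERS)
    # The transformation is a SQL statement executed by the warehouse itself.
    conn.execute(
        "CREATE TABLE elt_totals AS "
        "SELECT lower(trim(customer)) AS customer, "
        "SUM(CAST(amount AS REAL)) AS total "
        "FROM raw_orders GROUP BY lower(trim(customer))"
    )


def table_names(conn):
    """List the tables that now exist in the stand-in warehouse."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
run_etl(conn)
run_elt(conn)
print(table_names(conn))
```

Both paths produce the same per-customer totals. The difference is that the ELT path also leaves the untransformed raw_orders table in the warehouse, available for later transformations that were not planned up front.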

iPaaS: What Cloud Integration Platforms Can Do for You

An integration platform as a service (iPaaS) is a managed solution for hosting, developing, and integrating cloud data and applications. The best iPaaS solutions include easy-to-use graphical tools that help you visualize and work with the overall business intelligence picture.

Building a Governed Data Lake in the Cloud

The main purpose of a data lake is to provide full and direct access to raw (unfiltered) organizational data, as an alternative to storing varying and sometimes limited datasets in scattered data silos.

What is Master Data Management?

Master data management (MDM) is a method of enabling an organization to always work with—and make decisions based on—one version of current, ‘true’ data. Discover how it benefits a business, what challenges to plan for, and how to get started.

What is Database Integration?

Database integration is the process used to aggregate information from multiple sources and share a current, clean version of it across an organization. It is the operational core of big data. Here’s a look at the process, partners, and tools used in integration.

Data Quality Software

Learn more about Talend’s data quality solutions from the many resources on this website, or download Talend Open Studio for Data Quality today and start benefiting from the leading open source data quality tool.

Data Quality Services

When you've identified your initial data quality services targets and your specific quality improvement goals, deploy a data quality services platform that lets you start small and scale out. Talend Enterprise Data Quality lets you do just that.

Data Quality Tools

With the free-to-download, free-to-use data quality tools in Talend Open Studio for Data Quality, you can gain valuable knowledge about the current condition of your organization's stored data.

Running a Job on YARN

In this tutorial, create a Big Data batch Job that runs on YARN, reads data from HDFS, sorts it, and displays it in the Console.

Running a Job on Spark

Learn how to create a Big Data batch Job that uses the Spark framework, reads data from HDFS, sorts it, and displays it in the Console.

Creating Cluster Connection Metadata from Configuration Files

In this tutorial, create Hadoop Cluster metadata by importing the configuration from the Hadoop configuration files.
This tutorial uses Talend Data Fabric Studio version 6 and a Cloudera CDH version 5.4 Hadoop cluster.
1. Create a new Hadoop cluster metadata definition
Ensure that the Integration perspective is selected.
In the Project Repository, expand Metadata, right-click Hadoop Cluster, and click Create Hadoop Cluster to open the wizard.
In the Hadoop Cluster Connection wizard, type "MyHadoopCluster_files" in the Name field, "Cluster connection metadata" in the Purpose field, and "Metadata to connect to a Cloudera CDH 5.4 cluster" in the Description field, then click Next.
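
To give a sense of what the Hadoop configuration files contain, here is a minimal Python sketch that reads connection properties from two of them. It is not how the Talend wizard is implemented, only an illustration. The property names fs.defaultFS and yarn.resourcemanager.address are standard Hadoop settings, but the XML snippets, host names, and function names below are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical file contents; the host names and ports are placeholders.
CORE_SITE_XML = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>
"""

YARN_SITE_XML = """<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>resourcemanager.example.com:8032</value>
  </property>
</configuration>
"""


def read_hadoop_properties(xml_text):
    """Return the name/value pairs of one Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def build_cluster_metadata(name, *xml_texts):
    """Merge several configuration documents into one connection summary."""
    merged = {}
    for text in xml_texts:
        merged.update(read_hadoop_properties(text))
    return {
        "name": name,
        "namenode_uri": merged.get("fs.defaultFS"),
        "resource_manager": merged.get("yarn.resourcemanager.address"),
    }


metadata = build_cluster_metadata(
    "MyHadoopCluster_files", CORE_SITE_XML, YARN_SITE_XML
)
print(metadata)
```

On a real cluster the same properties would be read from the actual core-site.xml and yarn-site.xml files rather than from inline strings.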
