What’s New in Talend Winter ’17

Transform the data lake into qualified, clean data that anyone can use


Make Better, Faster Decisions

Talend Winter ’17 delivers the latest innovations so everyone can extract the insights they need from data.


Accelerate Big Data

Data Preparation for Big Data

Ensure Data

New Data Stewardship App

Stay on the Cutting Edge

One of the first to support Spark 2.0

Let anyone access and cleanse big data while governing its use.

Empower any decision maker with self-service tools to curate, catalog, cleanse, and shape data from the data lake for use anywhere. Accelerate your data preparations by running them on Spark, then make them easy for others to reuse. Quickly access the data you need with new big data connectors for CSV, Parquet, Avro, JSON, and HDFS (with Kerberos) and a new JDBC connector for traditional data sources. Your data experts design the integration rules, while IT governs the use of data and facilitates collaboration across the enterprise.

Empower business users to curate their data with the new Data Stewardship App.

Easily process and quickly resolve any data integrity issue to achieve “trusted” data across the enterprise. Define the common data models, semantics, and rules needed to cleanse and validate data, then define user roles, workflows, and priorities, and delegate tasks to the people who know the data best.

Improve productivity in your data curation tasks: matching and merging data, resolving data errors, certifying data, and arbitrating between conflicting values. Embed governance and stewardship into data integration flows, MDM initiatives, and matching processes. Monitor and audit your stewardship campaigns and data error resolutions.

Keep your technology on the cutting edge.

Innovate faster through support for the latest big data and cloud technologies. Building real-time integrations is now easier with Spark 2.0, Spark SQL and Structured Streaming (in technical preview). Since Talend Studio generates native and optimized code for Spark and Hadoop, migration is done with the push of a button.

Expand your big data architecture options with new components for MapR-DB and MapR-Streams, plus a new Talend Exchange connector for Snowflake. With more data sources in the data lake, Apache Atlas integration provides data lineage across the Hadoop cluster, so you know exactly where data came from and how it has been processed.

Improve project reusability with joblets for Spark Batch and Spark Streaming, and reinforce big data security with HDFS transparent encryption.

Additional Updates

To get the technical details on what’s new for these products, go to the Technical Note.

Talend Integration Cloud

Empower your organization to derive faster insights with cloud data lakes:

  • Create and manage role-specific configuration and security policies across the software development lifecycle (SDLC)
  • Process large data sets in parallel with AWS S3 multi-part support
  • Start/Stop/Monitor a flow using external schedulers (through Talend Integration Cloud's public API)
  • Support for Amazon SQS improves your messaging options
  • Support for AWS IAM cross-account roles or instance roles improves IT governance
  • Updated support for Salesforce and Salesforce Wave (Summer ’16)
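
As a rough illustration of why S3 multi-part support allows parallel processing, the sketch below splits one large object into independent byte ranges that separate workers could transfer at the same time. It is not Talend's implementation: the function name `plan_parts` and the 8 MiB default part size are assumptions made for this example. Only the two limits (at least 5 MiB for every part except the last, and at most 10,000 parts per upload) come from Amazon S3 itself.

```python
MIN_PART_SIZE = 5 * 1024 * 1024  # S3 minimum size for every part except the last
MAX_PARTS = 10_000               # S3 maximum number of parts in one upload


def plan_parts(total_size, part_size=8 * 1024 * 1024):
    """Return (part_number, start_byte, end_byte_exclusive) tuples.

    Hypothetical helper: it only plans the byte ranges, it does not
    talk to S3.
    """
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the S3 minimum of 5 MiB")
    # Grow the part size when the object would otherwise need too many parts.
    smallest_allowed = -(-total_size // MAX_PARTS)  # ceiling division
    part_size = max(part_size, smallest_allowed)
    return [
        (number, start, min(start + part_size, total_size))
        for number, start in enumerate(range(0, total_size, part_size), start=1)
    ]
```

Each tuple describes one part that does not depend on any other, so the ranges can be handed to a pool of workers and the parts assembled once all of them have finished.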

Data Integration/Unified Platform

Improve your productivity and project security:

  • Smart tMap component with “Auto Type Convert” and “Type Convert by Convention” options simplifies field mapping
  • Improved Talend Administration Center (TAC) security to better manage users, groups, roles, permissions, and password policies using SSO / SAML (Active Directory, SiteMinder, OKTA, Google)
  • Support for AWS IAM cross-account roles or instance roles improves IT governance
  • Enhanced ELK (Elasticsearch, Logstash, Kibana) support makes it faster and easier to search and analyze large datasets

Data Quality

Increase the integrity of data as it flows through the business:

  • Define custom semantic types so you can use your own business language to discover, protect, and manage your data
  • Manage your contact data with enhanced address validation and enrichment components for MelissaData, QAS and Loqate
  • Automate data quality with machine learning through big data matching components (tMatchPairing, tMatchPredict, tMatchModel) on Spark (was technical preview in Talend Summer ’16)
  • Enhance data privacy with a new data shuffling component and enhanced masking for emails and social security numbers
  • Speed up matching and improve its accuracy with multi-pass matching and support for the Hamming algorithm
  • Run data quality checks in the data lake with native Spark components for tGenKey, tVerifyEmail, tStandardizePhoneNumber, and tRuleSurvivorship
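
For readers unfamiliar with the Hamming algorithm mentioned in the list above, here is a minimal generic sketch of the measure. It shows the textbook definition only, not Talend's matching components: the function names `hamming_distance` and `hamming_similarity`, and the choice to scale the result to a 0.0 to 1.0 score, are assumptions made for this example.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(ch_a != ch_b for ch_a, ch_b in zip(a, b))


def hamming_similarity(a, b):
    """Scale the distance to a score where 1.0 means identical strings."""
    if not a and not b:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - hamming_distance(a, b) / len(a)
```

Because the measure compares values position by position, it suits fixed-length fields such as postal codes or identifiers. Values of different lengths need another similarity measure.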


Master Data Management

Design, ingest, author, curate, and update your master data faster:

  • Custom UI designer to turn data authoring and error resolution into well-managed and automated workflows
  • Data stewardship integration to delegate data curation to the people who know the data best
  • Improved performance when importing and exporting data
  • Improved impact analysis with more comprehensive feedback

Talend Data Mapper

  • Support for the HL7 V2 healthcare data format

Extend Your Integration Reach

Talend Studio features over 900 enterprise application components and connectors. For a complete list, go to Talendforge.org.

New and Enhanced Hadoop and NoSQL Platforms

Cloudera 5.7 | Hortonworks 2.4 | MapR 5.1 | Spark 1.6 | Amazon EMR 4.5, 4.6 | Microsoft Azure HDInsight 3.4 | Cassandra 3.4 | MongoDB 3.2 | AWS DynamoDB

New and Updated Components

Amazon SQS (Simple Queue Service) | AWS S3 Multipart support | MapR-DB | MapR-Streams | Salesforce (Summer ’16) | Salesforce Wave (Summer ’16) | Marketo | Microsoft SQL Server 16 | Bonita | DropBox V2 | Apache Camel 2.17.3 | Apache Karaf 4.0.6 | Apache ActiveMQ 5.14.0 | Apache CXF 3.1.7 | Spring Boot 1.3.7