NTT DATA FORMAT

Scene 1 (0s)

NTT – Luiss Guest Experience. 16/11/2023. BUSINESS AND MARKETING ANALYTICS.

Scene 2 (10s)

Canio Mancaniello. Executive Manager with experience in defining EDWH and in complex transformation projects towards the Cloud Data Platform. Computer Engineer with a specialization in artificial intelligence and an MBA obtained in 2014.

Scene 3 (34s)

Antonio Sisbarra. Advanced Data Engineer with experience in defining complex solutions and transformations in projects towards the Cloud Data Platform. Master's degree in Computer Science with a specialization in Data Analytics obtained in 2019.

Scene 4 (52s)

Agenda. NTT Data Group. Data Intelligence. Use Case Experience.

Scene 5 (1m 0s)

[Image: "The Original Emoji" by Shigetaka; the remaining text is not recoverable from the extraction.]

Scene 6 (1m 8s)

NTT Group. Billions in annual revenues. $500 million of investments in a startup fund for innovative technologies.

Scene 7 (1m 37s)

NTT DATA Group. [Image]

Scene 8 (2m 4s)

NTT DATA Italy. We anticipate the future with intelligence, in a collaborative environment where we continue to grow together, as a community and a society. We support the digital growth of the country and of the companies with which we collaborate, in Italy and abroad, generating more and more synergies and collaborations to take on bigger challenges and more innovative, inclusive and sustainable projects.

Scene 9 (2m 29s)

NTT DATA Italy matrix organization. Sectors: Telco & Media; Financial Services; Utilities & Energy; Industry; Public Sector. Capabilities: AIS (Application & Infrastructure Services); SES (SAP & Enterprise Solutions); DST (Digital Strategy & Technology).

Scene 10 (2m 40s)

DST Italy Internal organization. Business Advisory.

Scene 11 (2m 57s)

Agenda. NTT Data Group. Data Intelligence. Use Case Experience.

Scene 12 (3m 5s)

The House of D&I. Community. D&I Strategy; Data Consulting; Data Governance; Data Analysis; Data PM / Agile.

Scene 13 (3m 21s)

Data Intelligence Italy. Data Strategy & Consulting.

Scene 14 (3m 47s)

Data Intelligence Italy. xOps; Data Science & Artificial Intelligence; Data Experience; Cloud & Hybrid Data Platform; Data & Analytics Governance; Data Strategy.

Scene 21 (4m 56s)

E2E Capabilities. Solution Design; Solution Implementation; Analysis & Feasibility; Change Management; Standard definition; Best Practice; Auditing and Monitoring; Data Model design; Function Model design; Data Management design; Interfaces design; Regulatory rules design; Implementation and management processes; Business rules implementation; Regulatory (Privacy, SOX, etc.) rules implementation; Interfaces to access reporting/analysis (channels) implementation; User test; System deployment; Documentation and deliverables; DevOps; Training; Organizational change management support; Technological evolution support; Regulatory (privacy, security, etc.) processes definition support; Strategy evolution architecture; Cloud approach; Governance and project management support; Methodological approach; Business assessment; Analysis model definition; Information source discovery; Roles and responsibility matrix; Objectives definition support; KPIs definition support; Bus Matrix; Standard & Best Practice RollOut; Conceptual business model definition; Strategy, Consulting and Governance; Requirements definition; Application Maintenance support solutions service; PoC definition approach (“Try&Buy”); Feasibility study; Technical and business match analysis; Roadmap definition; Products installation and maintenance; AMS support; Environmental management.

Scene 22 (5m 35s)

Agenda. NTT Data Group. Data Intelligence. Use Case Experience.

Scene 23 (5m 43s)

Real world Use Case: from DB to Analytics in a big Business Company.

Scene 24 (5m 55s)

Main Goal (1/2). Unified star schema of tables in Databricks Delta Lake: build a library to download data from Synapse; optimize the notebooks that extract and load data into Delta Lake; optimize tables in Delta Lake for stakeholders; schedule the notebooks to extract and load data; switch to Synapse only and retire the Teradata sources.

Scene 25 (6m 12s)

Main Goal (2/2). Unified star schema of tables in Databricks Delta Lake: build a library to download data from Synapse; optimize the notebooks that extract and load data into Delta Lake; optimize tables in Delta Lake for stakeholders; schedule the notebooks to extract and load data; switch to Synapse only and retire the Teradata sources.

Scene 26 (6m 19s)

[Image: screenshot with text and a brand logo; content not recoverable from the extraction.]

Scene 27 (6m 28s)

Intro to Databricks Cloud platform: What is Databricks?

Scene 28 (6m 48s)

What is a notebook? A set of cells containing code or descriptive text. Each cell can be executed separately from the others, using a different language (SQL, Python, shell commands). Each cell can: apply one or more transformations to tables, variables, and data structures; read data from many sources (batch, streaming, databases); run data analysis and apply Data Science algorithms. Each notebook can be scheduled in a workflow, with dependencies on other notebooks.
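
The dependency idea in the last point can be sketched in plain Python. This is only an illustration, not the Databricks Workflows API: the notebook names are hypothetical, and the standard-library `graphlib` module stands in for the scheduler. It shows how declaring "this notebook needs that one" is enough to derive a safe execution order.

```python
from graphlib import TopologicalSorter

# Hypothetical notebook names; each maps to the notebooks it depends on.
dependencies = {
    "extract_from_synapse": set(),
    "load_delta_tables": {"extract_from_synapse"},
    "optimize_tables": {"load_delta_tables"},
    "monthly_report": {"optimize_tables"},
}


def run_order(deps):
    """Return an order in which every notebook runs after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

A cycle in the declared dependencies would make `static_order()` raise `graphlib.CycleError`, which is a useful early check before scheduling anything.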

Scene 29 (7m 12s)

What is Delta Lake DB? Open format: built on the open-source Apache Parquet format and fully compatible with Apache Spark. ACID transactions: Delta Lake enables ACID (atomicity, consistency, isolation, durability) transactions. Time travel: Delta Lake’s transaction log provides a master record of every change made to the data, which makes it possible to recreate the exact state of a data set at any point in time. Data versioning makes data analyses and experiments completely reproducible. Schema enforcement and schema evolution. Merge, update, delete: Delta Lake supports data manipulation language (DML) operations including merge, update, and delete commands.
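
To make the time-travel point concrete, here is a toy model in plain Python. It is not Delta Lake's actual log format or API; the class `ToyTable` and its methods are invented for illustration. It only shows the principle: if every change is appended to a log, any past version can be rebuilt by replaying a prefix of that log.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Append-only log of operations; a toy model, not Delta Lake's real format."""

    log: list = field(default_factory=list)

    def upsert(self, key, row):
        self.log.append(("upsert", key, row))
        return len(self.log)  # version number after this commit

    def delete(self, key):
        self.log.append(("delete", key, None))
        return len(self.log)

    def as_of(self, version=None):
        """Replay the first `version` log entries to rebuild the table state."""
        entries = self.log if version is None else self.log[:version]
        state = {}
        for op, key, row in entries:
            if op == "upsert":
                state[key] = row
            else:
                state.pop(key, None)
        return state


table = ToyTable()
v1 = table.upsert("c1", {"segment": "retail"})
v2 = table.upsert("c2", {"segment": "business"})
v3 = table.delete("c1")

print(table.as_of(v2))  # state before the delete
print(table.as_of())    # current state
```

Because old versions are never overwritten, an analysis pinned to a version number reads the same data every time, which is what makes experiments reproducible.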

Scene 30 (7m 40s)

[Image: logo and graphic elements with text; content not recoverable from the extraction.]

Scene 31 (7m 53s)

ASIS: “BA” database in Databricks.

Scene 32 (8m 0s)

Weak points in the BA process. Teradata library to download data: lots of manual settings to make. No optimization on BA tables (no partitioning, no caching). Manual run of each notebook to populate the star schema (more than 50 runs each month by hand). No parametrization of code. No comments or descriptions on the main parts of notebooks. Lots of “unofficial” notebooks in the official path of the Business Analytics team.

Scene 33 (8m 20s)

TOBE: The new «CDM» DB from Synapse (Cloud). [Screenshot: Microsoft Azure Databricks Workflows, job "Caricamento CDM" (CDM load), with the Run now, Runs and Tasks controls.]

Scene 34 (8m 33s)

TOBE: Extracting data from Synapse. [Image: software screenshot with text and numbers; content not recoverable from the extraction.]

Scene 35 (8m 45s)

Improvements in Databricks Delta Lake. Data stakeholders run many queries on the latest ANNOMESE. Tables are partitioned by ANNOMESE. Reading portions of data went from minutes to seconds. Temporary views on Databricks work better than complex queries on the Synapse data source. Spark lazy approach.
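
Why partitioning by ANNOMESE helps can be shown with a small plain-Python sketch. It is not Spark or Delta Lake code; the data, the helper names, and the assumption that ANNOMESE is a YYYYMM integer are all illustrative. The point is that a query on the latest month touches only that month's partition instead of scanning every row.

```python
from collections import defaultdict


def partition_by(rows, column):
    """Group rows into partitions keyed by the value of `column`."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[column]].append(row)
    return dict(partitions)


def read_partition(partitions, value):
    """Return the rows of one partition and how many rows were scanned."""
    rows = partitions.get(value, [])
    return rows, len(rows)


# Illustrative data; ANNOMESE is assumed to be a YYYYMM integer.
rows = [
    {"ANNOMESE": 202309, "client": "a", "amount": 10},
    {"ANNOMESE": 202310, "client": "b", "amount": 20},
    {"ANNOMESE": 202310, "client": "c", "amount": 5},
    {"ANNOMESE": 202311, "client": "a", "amount": 7},
]

partitions = partition_by(rows, "ANNOMESE")
last = max(partitions)
latest_rows, scanned = read_partition(partitions, last)
print(f"ANNOMESE {last}: scanned {scanned} of {len(rows)} rows")
```

The saving grows with the number of months stored, since the work for a single-month query stays proportional to that month's data only.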

Scene 36 (9m 14s)

TOBE: Better notebooks organization in Databricks.

Scene 37 (9m 59s)

TOBE: Notebooks workflow scheduling and temporal window.
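
A temporal window for a scheduled run could be computed along these lines. This is a sketch under assumptions: the slide does not show the actual logic, the function name `annomese_window` is hypothetical, and ANNOMESE is assumed to be a YYYYMM integer.

```python
from datetime import date


def annomese_window(reference, months):
    """Return the last `months` year-month keys (YYYYMM) ending at `reference`."""
    if months < 1:
        raise ValueError("months must be at least 1")
    # Count months from year 0 so stepping back across a year boundary is simple.
    index = reference.year * 12 + (reference.month - 1)
    window = []
    for offset in range(months - 1, -1, -1):
        year, month0 = divmod(index - offset, 12)
        window.append(year * 100 + month0 + 1)
    return window


window = annomese_window(date(2023, 11, 16), 3)
print(window)
```

Passing the window as a parameter to each notebook run, rather than editing dates by hand, is one way to address the lack of parametrization listed among the weak points.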

Scene 38 (10m 25s)

Future works: improve Analytics approach (1/5). As of now, there are only handwritten rules to categorize clients. Current state of analytics: notebooks produce statistics but no predictions or automatic smart classification. Many correlations and data remain to explore and discover with a modern, smart approach.

Scene 39 (11m 2s)

Future works: improve Analytics approach (2/5). As of now, there are only handwritten rules to categorize clients. Current state of analytics: notebooks produce statistics but no predictions or automatic smart classification. Many correlations and data remain to explore and discover with a modern, smart approach.

Scene 40 (11m 35s)

Future works: improve Analytics approach (3/5). Categorization by clustering. What is the best product to advertise? Which one best fits the needs of the specific client? Which model to adopt? Neural networks, decision trees, … Goal: maximize the probability of selling the product.

Scene 41 (11m 53s)

Future works: improve Analytics approach (4/5). Categorization by clustering. What is the best product to advertise? Which one best fits the needs of the specific client? Which model to adopt? Neural networks, decision trees, … Goal: maximize the probability of selling the product.

Scene 42 (12m 3s)

Future works: improve Analytics approach (5/5). Categorization by clustering. What is the best product to advertise? Which one best fits the needs of the specific client? Which model to adopt? Neural networks, decision trees, … Goal: maximize the probability of selling the product.
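
As one possible illustration of categorization by clustering, here is a minimal k-means sketch in plain Python. The slides do not name an algorithm, so k-means is an assumption, and the client features and values are made up. Real work on this data would more likely use a library implementation on Spark.

```python
from math import dist


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            for p in points
        ]
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid unchanged
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return labels, centroids


# Made-up features per client: (monthly transactions, average amount).
clients = [(2, 10.0), (3, 12.0), (2, 11.0), (20, 95.0), (22, 100.0), (21, 105.0)]
labels, centroids = kmeans(clients, centroids=[clients[0], clients[3]])
print(labels)
```

The resulting cluster labels could replace or complement the handwritten categorization rules, and serve as an input when choosing which product to advertise to each group.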

Scene 43 (12m 15s)

Conclusions. More than 30 notebooks migrated to CDM (Synapse data source). Built a scheduling logic and workflow for the new DB. Improved performance in reading tables for analysis. Reusable code, cleaner notebooks, descriptive cells. Working in SQL and Python. Easier ingestion with the Synapse library, no more manual credentials. Simple dual load with SQL and Python. ML models to train in the future business context. Data Science in a smarter way.

Scene 44 (12m 35s)

Restricted. NTT DATA Italia S.p.A.