NTT – Luiss Guess Experience. 16/11/2023. BUSINESS AND MARKETING ANALYTICS.
Canio Mancaniello. Executive Manager with experience in defining EDWH and in complex transformation projects towards the Cloud Data Platform. Computer Engineer with a specialization in artificial intelligence and an MBA obtained in 2014.
Antonio Sisbarra. Advanced Data Engineer with experience in defining complex solutions and transformations in projects towards the Cloud Data Platform. Master's degree in Computer Science with a specialization in Data Analytics, obtained in 2019.
agenda. NTT Data Group. Data Intelligence. Use Case Experience.
[image] Screenshot with partly legible text: "The Original Emoji by Shigetaka".
NTT: Group. Billions in annual revenues. $500 million invested in a startup fund for innovative technologies.
NTT DATA: Group.
NTT DATA: Italy. We anticipate the future with intelligence, in a collaborative environment where we continue to grow together, as a community and a society. We support the digital growth of the country and of the companies with which we collaborate, in Italy and abroad, generating more and more synergies and collaborations to face more important challenges and more innovative, inclusive and sustainable projects.
NTT DATA Italy matrix organization. Sectors: Telco & Media; Financial Services; Utilities & Energy; Industry; Public Sector. Capabilities: AIS (Application & Infrastructure Services); SES (SAP & Enterprise Solutions); DST (Digital Strategy & Technology).
DST Italy Internal organization. Business Advisory.
The House of D&I. COMMUNITY. D&I Strategy; Data Consulting; Data Governance; Data Analysis; Data PM / Agile.
Data Intelligence Italy. Data Strategy & Consulting.
Data Intelligence Italy. xOps; Data Science & Artificial Intelligence; Data Experience; Cloud & Hybrid Data Platform; Data & Analytics Governance; Data Strategy.
E2E Capabilities. Solution Design; Solution Implementation; Analysis & Feasibility; Change Management. Standard definition; Best Practice; Auditing and Monitoring; Data Model design; Function Model design; Data Management design; Interfaces design; Regulatory rules design; Implementation and management processes; Business rules implementation; Regulatory (Privacy, SOX, etc.) rules implementation; Interfaces to access reporting/analysis (channels) implementation; User test; System deployment; Documentation and deliverables; DevOps; Training; Organizational change management support; Technological evolution support; Regulatory (privacy, security, etc.) processes definition support; Strategy evolution architecture; Cloud approach; Governance and project management support; Methodological approach; Business assessment; Analysis model definition; Information source discovery; Roles and responsibility matrix; Objectives definition support; KPIs definition support; Bus Matrix; Standard & Best Practice RollOut; Conceptual Business model definition; Strategy, Consulting and Governance; Requirements definition; Application Maintenance support solutions service; PoC definition approach (“Try&Buy”); Feasibility study; Technical and business match analysis; Roadmap definition; Products installation and maintenance; AMS support; Environmental management.
Real-world Use Case: from DB to Analytics in a large company.
Main Goal. Unified star schema of tables in Databricks Delta Lake. Build a library to download data from Synapse. Optimize the notebook process that extracts and loads data into Delta Lake. Optimize tables in Delta Lake for stakeholders. Schedule the notebooks that extract and load data. Switch to Synapse only and retire the Teradata sources.
Intro to the Databricks Cloud platform: What is Databricks?
What is a notebook? A set of cells containing code or descriptive text. Each cell can be executed separately from the others and can use a different language (SQL, Python, shell commands). A cell can: apply one or more transformations to tables, variables and data structures; read data from many sources (batch, streaming, databases); run data analysis and apply Data Science algorithms. Each notebook can be scheduled in a workflow, with dependencies on other notebooks.
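The idea of scheduling notebooks with dependencies can be illustrated with a minimal sketch. The notebook names below are hypothetical, and Python's standard `graphlib` module stands in for the scheduler: a Databricks workflow resolves task dependencies on its own, so this only shows the ordering principle.

```python
from graphlib import TopologicalSorter

# Hypothetical notebooks: each key depends on the notebooks in its set.
dependencies = {
    "extract_from_source": set(),
    "load_dimensions": {"extract_from_source"},
    "load_facts": {"extract_from_source", "load_dimensions"},
    "refresh_reports": {"load_facts"},
}


def execution_order(deps):
    """Return one run order in which every notebook follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

A circular dependency between notebooks would make `static_order` raise `graphlib.CycleError`, which is the kind of mistake a workflow definition should reject before anything runs.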
What is Delta Lake DB? Open format: data is stored in the open-source Apache Parquet format and is fully compatible with Apache Spark. ACID transactions: Delta Lake enables ACID (atomicity, consistency, isolation, durability) transactions. Time travel: Delta Lake’s transaction log provides a master record of every change made to the data, which makes it possible to recreate the exact state of a data set at any point in time. Data versioning makes data analyses and experiments completely reproducible. Schema enforcement and schema evolution. Merge, update, delete: Delta Lake supports data manipulation language (DML) operations including merge, update, and delete commands.
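To make the time travel idea concrete, here is a toy sketch of an append-only change log that can be replayed up to any version. It is an illustration only and not how Delta Lake is implemented: the class name, the operations and the sample keys are all hypothetical.

```python
class ToyVersionedTable:
    """Toy append-only log of commits; NOT Delta Lake's actual implementation."""

    def __init__(self):
        self._log = []  # one entry per committed version

    def commit(self, operation, key, value=None):
        """Append one change and return the new version number."""
        if operation not in ("upsert", "delete"):
            raise ValueError(f"unsupported operation: {operation}")
        self._log.append((operation, key, value))
        return len(self._log)

    def as_of(self, version):
        """Rebuild the table state as it was right after `version` commits."""
        if not 0 <= version <= len(self._log):
            raise ValueError(f"unknown version: {version}")
        state = {}
        for operation, key, value in self._log[:version]:
            if operation == "upsert":
                state[key] = value
            else:
                state.pop(key, None)
        return state


table = ToyVersionedTable()
v1 = table.commit("upsert", "client_1", "retail")
v2 = table.commit("upsert", "client_1", "business")
v3 = table.commit("delete", "client_1")

print(table.as_of(v1))
print(table.as_of(v2))
print(table.as_of(v3))
```

Because the log is never rewritten, any earlier state stays reproducible, which is the property that makes analyses on a versioned table repeatable.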
ASIS: “BA” database in Databricks.
Weak points in the BA process. Teradata library to download data: lots of manual settings to make. No optimization on BA tables (no partitioning, no caching). Manual run of each notebook to populate the star schema (more than 50 runs by hand each month). No parametrization of code. No comments or descriptions on the main parts of notebooks. Lots of “unofficial” notebooks in the official path of the Business Analytics team.
TOBE: The new «CDM» DB from Synapse (Cloud). [image] Screenshot of the Azure Databricks Workflows job "Caricamento CDM" (CDM load).
TOBE: Extracting data from Synapse.
Improvements in Databricks Delta Lake. Data stakeholders run many queries on the latest ANNOMESE, so tables are partitioned by ANNOMESE. Reading portions of data went from minutes to seconds. Temporary views on Databricks work better than complex queries on the Synapse data source. Lazy evaluation approach of Spark.
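Why partitioning helps queries on the latest ANNOMESE can be shown with a small, self-contained sketch. It assumes ANNOMESE is a year-month key such as 202310, and the rows and column names other than ANNOMESE are hypothetical. In Spark, partitioning at write time is typically requested with `DataFrameWriter.partitionBy`; the plain-Python version below only mimics the effect.

```python
from collections import defaultdict

# Hypothetical rows; ANNOMESE is assumed to be a year-month key such as 202310.
rows = [
    {"ANNOMESE": 202308, "client": "A", "amount": 10},
    {"ANNOMESE": 202309, "client": "B", "amount": 20},
    {"ANNOMESE": 202310, "client": "A", "amount": 30},
    {"ANNOMESE": 202310, "client": "C", "amount": 40},
]


def partition_by(records, column):
    """Group records by the value of `column`, one bucket per distinct value."""
    partitions = defaultdict(list)
    for record in records:
        partitions[record[column]].append(record)
    return dict(partitions)


def read_partition(partitions, value):
    """Read only the bucket for `value`; the other buckets are never scanned."""
    return partitions.get(value, [])


partitions = partition_by(rows, "ANNOMESE")
latest = max(partitions)
latest_rows = read_partition(partitions, latest)
total = sum(row["amount"] for row in latest_rows)
print(latest, len(latest_rows), total)
```

A query restricted to one ANNOMESE touches a single bucket instead of the whole table, which is the mechanism behind reads that drop from minutes to seconds.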
TOBE: Better notebooks organization in Databricks.
TOBE: Notebooks workflow scheduling and temporal window.
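A scheduled run usually needs to know which months it should process. The sketch below computes such a temporal window, assuming ANNOMESE is an integer in YYYYMM form and that the window covers the reference month plus a fixed number of previous months. Both assumptions, and the function name `annomese_window`, are hypothetical and not taken from the project.

```python
from datetime import date


def annomese_window(reference, months_back):
    """Return YYYYMM integers for the reference month and the months before it, oldest first."""
    if months_back < 0:
        raise ValueError("months_back must be non-negative")
    # Count months since year 0 so that year boundaries need no special case.
    last = reference.year * 12 + (reference.month - 1)
    window = []
    for index in range(last - months_back, last + 1):
        year, month_zero_based = divmod(index, 12)
        window.append(year * 100 + month_zero_based + 1)
    return window


print(annomese_window(date(2023, 11, 16), 2))
print(annomese_window(date(2024, 1, 5), 2))
```

Passing the resulting values to a notebook as parameters keeps the window logic in one place instead of hard-coding months in each notebook.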
Future works: improve Analytics approach. As of now, there are only handwritten rules to categorize clients. Current state of analytics: notebooks produce statistics, but no predictions or automatic smart classification. There are lots of correlations and data to explore and discover, with a modern and smart approach.
Future works: improve Analytics approach (continued). Categorization by clustering. What is the best product to advertise? Which one best fits the needs of the specific client? Which model to adopt: neural networks, decision trees, …? Goal: maximize the probability of selling the product.
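As a sketch of what categorization by clustering could look like, the example below runs a plain k-means on two numeric features per client. The slides do not name a clustering algorithm, so k-means is used here only as a common illustration; the feature values and the starting centroids are hypothetical.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        updated = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid unchanged
        if updated == centroids:
            break  # converged: centroids no longer move
        centroids = updated
    return labels, centroids


# Hypothetical clients described by two numeric features each.
clients = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2), (8.0, 9.0), (8.5, 9.5), (9.0, 8.8)]
labels, centroids = kmeans(clients, centroids=[clients[0], clients[3]])
print(labels)
```

The resulting cluster labels could replace handwritten categorization rules, and a supervised model such as a decision tree could then be trained per cluster to estimate which product is most likely to sell.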
Conclusions. More than 30 notebooks migrated to CDM (Synapse data source). Built scheduling logic and a workflow for the new DB. Improved performance in reading tables for analysis. Reusable code, cleaner notebooks, descriptive cells. Working in SQL and Python. Easier ingestion with the Synapse library, no more manual credentials. Simple dual load with SQL and Python. ML models to train in the future business context. Data Science in a smarter way.
Restricted. NTT DATA Italia S.p.A.