Descriptive analysis

Scene 1 (0s)

[Audio] Descriptive analysis using SPSS. Mekdes T. (Assistant Professor of Epidemiology and Biostatistics).

Scene 2 (12s)

[Audio] Outline
❖ Importing data
❖ Data cleaning
❖ Handling variables (defining, creating, computing, recoding, adding and deleting)
❖ Descriptive analysis: proportion, mean, median, standard deviation, skewness, kurtosis
6/18/2022 BY MEKDS T. YILMA

Scene 3 (44s)

[Audio] Working with data in SPSS
Opening: an existing SPSS data file
Importing data: from Excel, CSV, database, or text
To open an existing SPSS file ➔ double-click the file, OR click File ➔ Open ➔ Data. The Open Data window will pop up.

Scene 4 (1m 23s)

[Audio] To import data from a database
File ➔ Import Data ➔ Database ➔ New Query ➔ the wizard's welcome window is displayed ➔ select MS Access Database ➔ Next (or double-click it) ➔ browse to your .mdb file ➔ in the Select Data window, drag all variables to the retrieve fields list ➔ click Next ➔ in the Define Variables window (single table) or the Specify Relationships window (multiple tables), manage your variables or join the tables ➔ Finish to import

Scene 5 (2m 1s)

[Audio] Importing from EpiData
Open EpiData ➔ Export Data ➔ browse to your .rec file ➔ select and open ➔ OK ➔ the data are now exported in two formats, .spss and .txt ➔ click on the SPSS file ➔ the SPSS syntax window opens ➔ scroll down to the last SAVE line ➔ remove the asterisk (*) ➔ run the syntax (Run ➔ All in the SPSS syntax menu bar)

Scene 6 (2m 39s)

[Audio] Data cleaning
➢ This is the initial step of any statistical analysis ➔ it aims to check for and address any errors
➢ Data processing errors are errors that occur after data have been collected. Common errors are:
➢ Transpositions (e.g., 19 becomes 91 during data entry)
➢ Copying errors (e.g., 0 (zero) becomes O during data entry)
➢ Coding errors (e.g., a racial group gets improperly coded because of changes in the coding scheme)
➢ Routing errors (e.g., the interviewer asks the wrong question or asks questions in the wrong order)
➢ Consistency errors (contradictory responses, such as a reported hysterectomy or pregnancy for a respondent who has been identified as male)
➢ Range errors (responses outside the range of plausible answers, such as a reported age above 110)
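
The range and consistency errors listed above can be screened for mechanically. The following is a minimal Python sketch of that idea, written outside SPSS purely to show the logic. The record layout (`age`, `sex`, `pregnant`) and the upper bound of 110 are assumptions made for illustration, not part of the original slides.

```python
# Hypothetical record layout: each case is a dict with "age", "sex", "pregnant".
MAX_PLAUSIBLE_AGE = 110  # assumed upper bound, borrowed from the slide's example


def find_errors(record):
    """Return a list of data-processing problems found in one record."""
    problems = []
    age = record.get("age")
    if age is None:
        problems.append("age missing")
    elif not 0 <= age <= MAX_PLAUSIBLE_AGE:
        # Range error: value outside the plausible interval.
        problems.append("age out of range")
    if record.get("sex") == "male" and record.get("pregnant") == "yes":
        # Consistency error: two answers that contradict each other.
        problems.append("inconsistent: male and pregnant")
    return problems
```

Running `find_errors` over every record and listing the non-empty results gives a worklist of cases to verify against the original questionnaires.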

Scene 7 (3m 52s)

[Audio] How to prevent data processing errors
✓ Manual checks during data collection (e.g., checks for completeness, handwriting legibility)
✓ Range and consistency checking during data entry (e.g., preventing impossible results, such as ages greater than 110)
✓ Double entry and validation following data entry
✓ Screening for outliers during data analysis
✓ Descriptive analysis
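
Double entry and validation amounts to comparing two independently entered copies of the same records. A small Python sketch of that comparison follows; it assumes both copies are lists of dicts in the same record order, which is an illustrative simplification rather than how any particular data entry tool stores its files.

```python
def double_entry_mismatches(first, second):
    """Compare two independent entries of the same records, field by field.

    Returns (row index, field name, first value, second value) for each
    disagreement. Assumes both lists hold the records in the same order.
    """
    mismatches = []
    for row, (a, b) in enumerate(zip(first, second)):
        for field in a:
            if a[field] != b.get(field):
                mismatches.append((row, field, a[field], b.get(field)))
    return mismatches
```

A transposition such as 19 entered as 91 shows up as a single mismatch that can then be resolved against the paper form.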

Scene 8 (4m 33s)

[Audio] Descriptive analysis: exploratory and summary statistics
✓ You can simply right-click on the variable and select Descriptive Statistics ➔ generate the frequency distribution

Scene 9 (4m 58s)

[Audio] OR Analyze ➔ Descriptive Statistics ➔ Frequencies ➔ Statistics
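
To make the summary statistics named in the outline concrete (mean, median, standard deviation, skewness, kurtosis), here is a Python sketch that computes them by hand. It uses the bias-adjusted sample formulas for skewness and excess kurtosis, which to my understanding are the ones statistical packages such as SPSS report; treat that correspondence as an assumption and check it against your own output. The function name `describe` is hypothetical.

```python
import statistics


def describe(values):
    """Summary statistics for a list of at least four numbers (not all equal)."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD, n - 1 in the denominator
    # Sums of standardized deviations raised to the 3rd and 4th power.
    z3 = sum(((x - mean) / sd) ** 3 for x in values)
    z4 = sum(((x - mean) / sd) ** 4 for x in values)
    # Adjusted sample skewness and excess kurtosis (both 0 for a normal curve).
    skewness = n / ((n - 1) * (n - 2)) * z3
    kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
                - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return {
        "mean": mean,
        "median": statistics.median(values),
        "sd": sd,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }
```

For the symmetric data `[1, 2, 3, 4, 5]` the skewness is 0 and the excess kurtosis is -1.2, reflecting a flat distribution with no long tail on either side.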

Scene 10 (5m 29s)

[Audio] SPSS Descriptives

Scene 11 (5m 53s)

[Audio] SPSS Explore

Scene 12 (6m 23s)

[Audio] Graphs in SPSS
This window reminds you to define value labels for the variable. If you have already defined them ➔ OK. If not, click Define Variable Properties.
✓ Select the type of graph
✓ Click on the selected graph type
✓ Then drag and drop it into the preview
✓ Select the variable for each axis
✓ Finally, format the graph title, footnote, etc., then click OK

Scene 13 (7m 10s)

[Audio] Common graphs for diagnosis and cleaning
➢ Histogram / stem-and-leaf plot ➔ distribution of the variable
➢ Normal Q-Q plot ➔ normality
➢ Box-and-whisker plot ➔ outliers
➢ Scatter plot ➔ correlation
➢ Correlation matrix ➔ correlation
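
A box-and-whisker plot conventionally flags points lying more than 1.5 interquartile ranges beyond the quartiles. The Python sketch below applies that rule numerically. Note that `statistics.quantiles` uses its own default quartile method, so the fences may differ slightly from those a given package draws; the function name and the multiplier default are assumptions for illustration.

```python
import statistics


def boxplot_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # lower quartile, median, upper quartile
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < low or x > high]
```

For example, in `[10, 12, 12, 13, 12, 11, 14, 13, 15, 102]` only the value 102 falls outside the fences, which is the kind of entry worth checking for a typing error.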

Scene 14 (7m 43s)

[Audio] Handling variables
❑ For your data analysis to be accurate, it is imperative that you correctly identify the type and formatting of each variable.
❑ Information on the type of each variable is displayed in the Variable View tab.
❑ Click the Variable View tab, locate the variable, and click the cell beneath the "Type" column. A blue "…" button will appear. Clicking the blue "…" button opens the Variable Type window. Select the appropriate type for the variable.

Scene 15 (8m 30s)

[Audio] Important tip
Numeric: when you are dealing with a numeric variable, check which kind it is (continuous or discrete).
✓ Note: numeric codes assigned to nominal and ordinal variables should not be used for mathematical calculation.
String: alphanumeric or character variables, whose values are treated as text.
✓ In the Data View window, missing string values will appear as blank cells. However, these blank cells are not recognized by SPSS as system-missing values.
✓ SPSS considers even blank strings to be non-missing ➔ this has important implications for your analysis ➔ it will affect your sample size.
Date: treated as a special type of numeric variable.
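
The effect of blank strings on sample size can be seen in a few lines of Python. This sketch only mimics the counting logic; the function name `valid_n` and the flag `blank_is_missing` are hypothetical and do not correspond to any SPSS option name.

```python
def valid_n(values, blank_is_missing):
    """Count the string values that would be treated as valid (non-missing)."""
    count = 0
    for value in values:
        if blank_is_missing and value.strip() == "":
            continue  # a blank cell is excluded only if we say so explicitly
        count += 1
    return count
```

With the responses `["A", "", "B", " "]`, the valid count is 4 when blanks are left as ordinary values and 2 once blanks are declared missing, so any percentage based on the valid count changes accordingly.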

Scene 16 (9m 37s)

[Audio] Tips continued: SPSS date format

Scene 17 (10m 19s)

[Audio] Tips continued: SPSS duration format

Scene 18 (10m 36s)

[Audio] Changing the variable type from string or numeric to a date/time format
Step 1: Define the variable as date/time and select the format in which your dates/times currently appear.
Step 2: After you have specified the current format of the date/time values for that variable, you can change the format of the date by following the same steps you used in Step 1 to define the variable type and date format.
❑ Note: these steps apply when your variable values already appear in a standard date/time format but the variable is currently defined as "string" or "numeric" rather than date/time.

Scene 19 (11m 33s)

[Audio] Transformations and calculations that involve date and time variables, using the Date and Time Wizard

Scene 20 (12m 15s)

[Audio] Date and Time Wizard
Can assist you with:
• Creating a date/time variable from a string containing a date or time
• Creating a date/time variable from variables that contain parts of dates or times
• Calculating with dates and times (e.g., to calculate elapsed time)
• Extracting parts of dates or times
• Assigning periodicity to a dataset for time series data
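
Three of the tasks listed above (turning a date string into a date, computing elapsed time, and extracting parts of a date) can be sketched in Python with the standard `datetime` module. The day/month/year text format and the function names are assumptions chosen for this example.

```python
from datetime import datetime

DATE_FORMAT = "%d/%m/%Y"  # assumed layout of the date strings, e.g. "18/06/2022"


def elapsed_days(start_text, end_text):
    """Parse two date strings and return the whole days between them."""
    start = datetime.strptime(start_text, DATE_FORMAT)
    end = datetime.strptime(end_text, DATE_FORMAT)
    return (end - start).days


def date_parts(text):
    """Extract (year, month, day) from a date string."""
    parsed = datetime.strptime(text, DATE_FORMAT)
    return parsed.year, parsed.month, parsed.day
```

For instance, `elapsed_days("01/01/2022", "18/06/2022")` gives 168, the kind of value one would compute for follow-up time or age at interview.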

Scene 21 (12m 56s)

[Audio] Defining variables
This involves defining the name, type, label, values, missing values, and so on.

Scene 22 (13m 17s)

[Audio] The Variable View tab displays:
Name: To change a variable's name,
➢ Double-click on the name of the variable that you wish to rename.
➢ Type your new variable name.
Type: To change a variable's type,
➢ Click inside the cell corresponding to the "Type" column for that variable.
➢ A square "..." button will appear; click on it to open the Variable Type window.
➢ Click the option that best matches the type of variable.
➢ Click OK.
Width: The number of digits displayed for numerical values, or the length of a string variable. To set a variable's width,
➢ Click inside the cell corresponding to the "Width" column for that variable.
➢ Then click the "up" or "down" arrow icons to increase or decrease the width.

Scene 23 (14m 29s)

[Audio] Defining variable cont
Decimals: The number of digits to display after the decimal point for values of that variable.
◦ Note that this does not apply to string variables. It changes how the numbers are displayed but does not change the values in the dataset.
To specify the number of decimal places for a numeric variable,
➢ Click inside the cell corresponding to the "Decimals" column for that variable.
➢ Then click the "up" or "down" arrow icons to increase or decrease the number of decimal places.
Label: A brief but descriptive definition or display name for the variable. When defined, a variable's label will appear in the output in place of its name.
➢ Double-click on the Label cell of the variable that you wish to label.
➢ Type the description of the variable.

Scene 24 (15m 39s)

[Audio] Defining variable cont
Values: Value labels are useful mainly for categorical variables. To define value labels,
➢ Click the cell in the "Values" column that corresponds to the variable whose values you wish to label.
➢ If the values are currently undefined, the cell will say "None."
➢ Click the square "…" button. The Value Labels window appears.
➢ Enter the number in the Value field and its description in the Label field.
➢ Then click Add ➔ OK.

Scene 25 (16m 25s)

[Audio] Defining variable cont
Missing: To set user-defined missing values (number, dot or codes),
➢ Click inside the cell corresponding to the "Missing" column for that variable ➔ the missing values dialog box opens.
➢ Click the option that best matches how you wish to define missing data and enter any associated values.
➢ Then click OK.
Columns: The width of each column in the spreadsheet. To set a variable's column width,
➢ Click inside the cell corresponding to the "Columns" column for that variable.
➢ Then click the "up" or "down" arrow icons to increase or decrease the column width.
Align: The alignment of content in the cells of the SPSS Data View spreadsheet. To set the alignment for a variable,
➢ Click inside the cell corresponding to the "Align" column for that variable.
➢ Then use the drop-down menu to select your preferred alignment: Left, Right, or Center.
Measure: The level of measurement for the variable (nominal, ordinal, or scale). To define a variable's measurement level,
➢ Click inside the cell corresponding to the "Measure" column for that variable.
➢ Then click the drop-down arrow to select the level of measurement for that variable: Scale, Ordinal, or Nominal.
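
Why user-defined missing codes matter can be shown with a short Python sketch: a placeholder code such as 99 distorts a mean unless it is excluded. The code 99 and the function name are hypothetical examples, not values taken from the slides.

```python
import statistics


def mean_excluding(values, missing_codes):
    """Mean of the values after dropping None and any user-defined missing codes."""
    valid = [v for v in values if v is not None and v not in missing_codes]
    return statistics.mean(valid) if valid else None
```

With ages `[20, 30, 99, None, 40]` and 99 declared as a missing code, the mean is 30; if 99 were left as an ordinary value, the mean would be pulled up to 47.25.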

Scene 26 (18m 17s)

[Audio] Defining variable cont
Role: The role that a variable will play in your analyses (i.e., independent variable, dependent variable, or both).
➢ Input: The variable will be used as a predictor (independent variable).
➢ Target: The variable will be used as an outcome (dependent variable).
➢ Both: The variable will be used as both a predictor and an outcome.
➢ None: The variable has no role assignment.
➢ Partition: The variable will partition the data into separate samples.
➢ Split: Used with IBM® SPSS® Modeler (not IBM® SPSS® Statistics).
To define a variable's role in your analysis,
➢ Click inside the cell corresponding to the "Role" column for that variable.
➢ Then use the drop-down menu to select the role that the variable will take.

Scene 27 (19m 37s)

[Audio] You can also define variables from the menu bar using Define Variable Properties
Data ➔ Define Variable Properties ➔ select the variable and define it in the dialog box

Scene 28 (20m 16s)

[Audio] Variable transformation and data management
➢ Variable transformation involves recoding, merging, generating and computing variables
➢ Data management involves sorting, splitting, weighting and partitioning

Scene 29 (20m 43s)

[Audio] Recoding
➢ Recoding a variable: used to transform an existing variable into a different form based on certain criteria.
➢ It can be done by combining some of the variable's categories or values.
➢ Used to change a continuous variable into an ordinal categorical variable.
➢ Used to merge the categories of a nominal variable.
➢ Automatic Recode can also be used to quickly convert a string categorical variable into a numeric categorical variable.

Scene 30 (21m 30s)

[Audio] Recode into Different Variables
Transform ➔ Recode into Different Variables ➔ in the new dialog box, select and move the input variable ➔ define the name and label of the output variable ➔ click Change ➔ then click Old and New Values ➔ in the new dialog box, define the old and new values ➔ Add ➔ Continue ➔ OK

Scene 31 (22m 8s)

[Audio] Recode into Different Variables cont
Note:
➢ When recoding variables, always handle the missing values first! The most common recoding errors happen when you don't tell SPSS explicitly what to do with missing values.
➢ This procedure does not include the ability to add value labels to the new categories, so immediately after recoding, you should add value labels to your new numeric codes.
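
The two notes above (deal with missing values first, then label the new codes) translate directly into code. Below is a Python sketch that recodes a continuous age into an ordinal variable. The cut points and labels are hypothetical, chosen only to illustrate the pattern.

```python
# Hypothetical value labels for the new codes, added right after recoding.
AGE_GROUP_LABELS = {1: "under 18", 2: "18 to 59", 3: "60 and over"}


def recode_age(age):
    """Recode a continuous age into an ordinal code (1, 2 or 3)."""
    if age is None:
        return None  # handle missing first, so it never falls into a category
    if age < 18:
        return 1
    if age < 60:
        return 2
    return 3
```

Because the result is written to a new variable, the original ages stay untouched and the recode can be checked by cross-tabulating old against new values.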

Scene 32 (22m 48s)

[Audio] Recode into Same Variables
Transform ➔ Recode into Same Variables ➔ in the new dialog box, select the variable and move it to the variables box ➔ click Old and New Values ➔ in the new dialog box, set the old and new values ➔ Add ➔ Continue ➔ OK

Scene 33 (23m 15s)

[Audio] Automatic Recode
➢ Recodes categorical string variables into labeled numeric variables.
➢ The first step is resolving any issues with "mismatched" category strings, for example different capitalizations of the same word, or a space before or after a category.
➢ Then Transform ➔ Automatic Recode ➔ in the new dialog box, select and move the variable ➔ enter a new name for the auto-recoded variable in the New Name field, then click Add New Name ➔ check "Treat blank string values as user-missing" ➔ OK
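
The logic of an automatic recode can be sketched in Python: clean the strings, sort the distinct categories, and number them consecutively, keeping the original text as the label. Lower-casing and trimming here stand in for the manual "mismatched strings" clean-up described above; they are assumptions of this sketch, not something the SPSS procedure is claimed to do for you.

```python
def auto_recode(values):
    """Turn string categories into numeric codes plus a code-to-label mapping."""
    # Resolve mismatches first: trim spaces and unify capitalization.
    cleaned = [v.strip().lower() for v in values]
    # Distinct non-blank categories in alphabetical order get codes 1, 2, 3, ...
    categories = sorted({v for v in cleaned if v != ""})
    codes = {category: i for i, category in enumerate(categories, start=1)}
    # Blank strings have no code and come out as None (treated as missing).
    recoded = [codes.get(v) for v in cleaned]
    labels = {code: category for category, code in codes.items()}
    return recoded, labels
```

For the input `["Male", "female ", "MALE", ""]` this yields the codes `[2, 1, 2, None]` with labels `{1: "female", 2: "male"}`.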

Scene 34 (24m 4s)

[Audio] Automatic Recode cont

Scene 35 (24m 18s)

[Audio] Compute Variable
➢ Used to create new variables from existing variables by applying formulas. For example, it can be used to:
• Convert the units of a variable from pounds to kg
• Use a subject's height and weight to compute their BMI
• Compute a subscale score from items on a survey
Transform ➔ Compute Variable ➔ in the dialog box, provide the target variable name, the numeric expression, and so on ➔ OK
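
The first two examples above are simple formulas: one pound is 0.45359237 kg, and BMI is weight in kilograms divided by the square of height in metres. A Python sketch of both follows; the function names are hypothetical.

```python
LB_TO_KG = 0.45359237  # kilograms per pound


def pounds_to_kg(pounds):
    """Convert a weight from pounds to kilograms."""
    return pounds * LB_TO_KG


def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2
```

A subject weighing 70 kg at 1.75 m has a BMI of about 22.9. In the Compute Variable dialog the same formula would be typed as the numeric expression, using your own variable names.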

Scene 36 (25m 3s)

[Audio] Generating a new variable by computing existing variables
➢ Transform ➔ Compute Variable ➔ in the dialog box, provide the target variable name ➔ in the Function group list, click All ➔ in the Functions and Special Variables list, scroll down and double-click Mean ➔ MEAN(?,?) will appear in the Numeric Expression box ➔ replace each ? with a variable (variables should be separated by commas inside the parentheses) ➔ OK
➢ Check the new variable in the variable list
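
As I understand it, MEAN(...) averages the valid values across the listed variables for each case, skipping missing ones. The Python sketch below mirrors that per-case behaviour; `row_mean` is a hypothetical name and `None` stands in for a missing value.

```python
def row_mean(*items):
    """Mean of the non-missing arguments for one case; None if all are missing."""
    valid = [x for x in items if x is not None]
    return sum(valid) / len(valid) if valid else None
```

A respondent who answered two of three survey items with 4 and 2 gets a subscale score of 3.0, rather than being dropped for the one unanswered item.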

Scene 37 (25m 48s)

[Audio] Generating a new variable by computing existing variables, continued
➢ Transform ➔ Compute Variable ➔ in the dialog box, provide the target variable name ➔ in the Function group list, click All ➔ in the Functions and Special Variables list, scroll down and double-click Any ➔ ANY(?,?) will appear in the Numeric Expression box ➔ replace the first ? with the variable or value to test and the remaining ? with the variables or values to compare it against (separated by commas inside the parentheses) ➔ OK
➢ Check the new variable in the variable list
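
To my understanding, the ANY function takes a test value as its first argument and returns 1 if that value matches any of the remaining arguments, and 0 otherwise. The following Python sketch imitates that behaviour; the name `any_match` is hypothetical.

```python
def any_match(test, *candidates):
    """Return 1 if test equals at least one candidate, else 0."""
    return 1 if any(test == c for c in candidates) else 0
```

For example, testing the value 1 ("yes") against three symptom variables coded 0, 1, 0 gives 1, which flags a case that reported at least one symptom.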

Scene 38 (26m 34s)

[Audio] Splitting
➢ Splitting: used to organize statistical results into groups for comparison without separating your data into different files.
➢ The splitting variable(s) should be nominal or ordinal categorical.
➢ The data should first be sorted with respect to the splitting variable.
To split, go to:
➢ Data ➔ Split File ➔ in the new dialog box, select either Compare groups or Organize output by groups ➔ OK
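
Conceptually, splitting runs the same analysis once per group while the data stay in one file. Here is a Python sketch of that idea using a group mean; the function name and the dict-based case layout are assumptions for illustration.

```python
import statistics
from collections import defaultdict


def split_means(cases, split_key, value_key):
    """Mean of value_key computed separately for each level of split_key."""
    groups = defaultdict(list)
    # Sort by the splitting variable first, as the procedure above requires.
    for case in sorted(cases, key=lambda c: c[split_key]):
        groups[case[split_key]].append(case[value_key])
    return {group: statistics.mean(vals) for group, vals in groups.items()}
```

The result holds one summary per group, side by side, much like the output layout when groups are compared.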

Scene 39 (27m 17s)

[Audio] Weighting
➢ Used to make each case count for the number of cases it represents, especially when your data record counts; in that case the "weight" is the number of occurrences.
➢ This often happens in large surveys, to adjust for over- or under-representation of certain characteristics in your sample.
➢ To enable weighting:
• Data ➔ Weight Cases ➔ in the new dialog box, click Weight cases by, then double-click on the name of the weighting variable in the left-hand column to move it to the Frequency Variable field. Click OK.
➢ To turn off an enabled weighting variable, open the Weight Cases window again and click Do not weight cases. Click OK.
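
When each row carries a count, weighting simply means summing those counts instead of counting rows. A Python sketch of weighted proportions follows; the row layout and key names are hypothetical.

```python
from collections import Counter


def weighted_proportions(rows, category_key, weight_key):
    """Share of the total weight that falls in each category."""
    totals = Counter()
    for row in rows:
        totals[row[category_key]] += row[weight_key]  # add the weight, not 1
    grand_total = sum(totals.values())
    return {category: weight / grand_total for category, weight in totals.items()}
```

Two rows, "yes" with a count of 12 and "no" with a count of 38, give proportions of 0.24 and 0.76. Unweighted, each row would wrongly count as half the sample.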

Scene 40 (28m 14s)

[Audio] Crosstab
Click Analyze ➔ Descriptive Statistics ➔ Crosstabs ➔ in the new dialog box, move the two variables into the Row(s) and Column(s) boxes
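
A crosstab is a count of cases for every combination of two categorical variables. The Python sketch below builds those cell counts with a `Counter`; the key names in the usage example are hypothetical.

```python
from collections import Counter


def crosstab(rows, row_key, col_key):
    """Count cases for each (row category, column category) pair."""
    return Counter((r[row_key], r[col_key]) for r in rows)
```

Each key of the result is one cell of the table, for example `("F", "yes")`, and a combination that never occurs simply has a count of 0.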

Scene 41 (28m 40s)

[Audio] Descriptive analysis
1. Now you have cleaned your data
2. Managed variables
3. Managed outliers
4. Run diagnostics ➔ normality
5. Finally, run frequencies and summary statistics for reporting purposes

Scene 42 (29m 19s)

[Audio] Exporting output
Right-click on the result ➔ in the window that opens, choose the output you want (Selected) ➔ choose the document type ➔ browse to the file directory ➔ set the file name and save ➔ OK

Scene 43 (29m 59s)

[Audio] Thank You!!!