
R Under development (unstable) (2025-09-01 r88761 ucrt) -- "Unsuffered Consequences"
Copyright (C) 2025 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> if (requireNamespace("testthat", quietly = TRUE)) {
+   library(testthat)
+   library(dataPreparation)
+   test_check("dataPreparation")
+ }
dataPreparation 1.1.2
Type data_preparation_news() to see new features/changes/bug fixes.
[1] "aggregate_by_key: I start to aggregate"
[1] "aggregate_by_key: 6 columns have been constructed. It took 0.06 seconds. "
[1] "find_and_transform_dates: It took me 2.08s to identify formats"
[1] "find_and_transform_dates: It took me 0.14s to transform 4 columns to a Date format."
[1] "find_and_transform_dates: It took me 0s to identify formats"
[1] "find_and_transform_dates: There are no dates to transform.\n (If i missed something please provide the date format in inputs or\n consider using set_col_as_date to transform it)."
[1] "identify_dates: column date_col seems to have an ambiguity, I try to solve it."
[1] "V2"
[1] "fast_discretization: V2 aren't columns of types numeric i do nothing for those variables."
[1] "fast_discretization: I will build splits for 1 numeric columns using, equal_width method."
[1] "fast_discretization: it took me: 0s to build splits for 1 numeric columns."
[1] "fast_discretization: I will build splits for 1 numeric columns using, equal_freq method."
[1] "fast_discretization: it took me: 0s to build splits for 1 numeric columns."
[1] "fast_discretization: I will build splits for 1 numeric columns using, equal_width method." [1] "fast_discretization: it took me: 0s to build splits for 1 numeric columns." [1] "fast_discretization: I will build splits for 0 numeric columns using, equal_width method." [1] "fast_discretization: it took me: 0s to build splits for 0 numeric columns." [1] "fast_discretization: I will build splits for 1 numeric columns using, equal_width method." [1] "equal_width_splits: constant_col can't provide 10 equal width bins; instead you will have 0 bins." [1] "fast_discretization: column constant_col seems to be constant, I do nothing." [1] "fast_discretization: it took me: 0s to build splits for 0 numeric columns." [1] "equal_width_splits: data_set can't provide 10 equal width bins; instead you will have 0 bins." [1] "equal_freq_splits: data_set can't provide 10 equal freq bins; instead you will have 2 bins." [1] "fast_discretization: I will build splits for 1 numeric columns using, equal_width method." [1] "fast_discretization: it took me: 0s to build splits for 1 numeric columns." [1] "fast_discretization: I will discretize 1 numeric columns using, bins." [1] "fast_discretization: it took me: 0s to transform 1 numeric columns into, binary columns." [1] "un_factor: I will identify variable that are factor but shouldn't be." [1] "un_factor: I un-factor false_factor." [1] "un_factor: It took me 0s to un-factor 1 column(s)." [1] "un_factor: I will identify variable that are factor but shouldn't be." [1] "un_factor: I un-factor true_factor." [1] "un_factor: I un-factor false_factor." [1] "un_factor: It took me 0s to un-factor 2 column(s)." [1] "fast_filter_variables: I check for constant columns." [1] "fast_filter_variables: I delete 1 constant column(s) in data_set." [1] "fast_filter_variables: I check for columns in double." [1] "fast_filter_variables: I delete 1 column(s) that are in double in data_set." 
[1] "fast_filter_variables: I check for columns that are bijections of another column." [1] "fast_filter_variables: I delete 3 column(s) that are bijections of another column in data_set." [1] "fast_filter_variables: I check for columns that are included in another column." [1] "fast_filter_variables: I delete 1 column(s) that are bijections of another column in data_set." [1] "string_column" [1] "fast_round: string_column aren't columns of types numeric or integer i do nothing for those variables." [1] "string_column" [1] "fast_round: string_column aren't columns of types numeric or integer i do nothing for those variables." [1] "generate_from_character: it took me: 0s to transform 1 character columns into, 3 new columns." [1] "generate_from_character: it took me: 0s to transform 3 character columns into, 9 new columns." [1] "generate_from_character: it took me: 0s to transform 1 character columns into, 3 new columns." [1] "generate_factor_from_date: I will create a factor column from each date column." [1] "generate_factor_from_date: It took me 0s to transform 1 column(s)." [1] "ID" [1] "generate_date_diffs: ID aren't columns of types date i do nothing for those variables." [1] "generate_date_diffs: I will generate difference between dates." [1] "generate_date_diffs: It took me 0.01s to create 3 column(s)." [1] "date1" "date2" "date3" "date4" [5] "num1" "num2" "constant" "num3" [9] "age" "fnlwgt" "education_num" "capital_gain" [13] "capital_loss" "hr_per_week" [1] "generate_from_factor: c(\"date1\", \"date2\", \"date3\", \"date4\", \"num1\", \"num2\", \"constant\", \"num3\", \"age\", \"fnlwgt\", \"education_num\", \"capital_gain\", \"capital_loss\", \"hr_per_week\") aren't columns of types factor i do nothing for those variables." [1] "generate_from_factor: it took me: 0.02s to transform 10 factor columns into, 30 new columns." [1] "generate_from_factor: it took me: 0s to transform 4 factor columns into, 12 new columns." 
[1] "one_hot_encoder: Since you didn't provide encoding, I compute them with build_encoding." [1] "build_encoding: I will compute encoding on 1 character and factor columns." [1] "build_encoding: it took me: 0s to compute encoding for 1 character and factor columns." [1] "one_hot_encoder: I will one hot encode some columns." [1] "one_hot_encoder: I am doing column: character_col" [1] "one_hot_encoder: It took me 0s to transform 1 column(s)." [1] "build_encoding: I will compute encoding on 1 character and factor columns." [1] "build_encoding: it took me: 0s to compute encoding for 1 character and factor columns." [1] "build_encoding: I will compute encoding on 1 character and factor columns." [1] "build_encoding: it took me: 0s to compute encoding for 2 character and factor columns." [1] "build_target_encoding: Start to compute encoding for target_encoding according to col: grades." [1] "target_encode: Start to encode columns according to target." [1] "build_target_encoding: Start to compute encoding for target_encoding according to col: grades." [1] "target_encode: Start to encode columns according to target." [1] "build_target_encoding: Start to compute encoding for target_encoding according to col: target." [1] "build_target_encoding: Start to compute encoding for target_encoding according to col: target." [1] "build_target_encoding: Start to compute encoding for target_encoding according to col: target." [1] "real_cols: col_2 aren't columns of the table, i do nothing for those variables" [1] "col_2" [1] "real_cols: col_2 aren't columns of types numeric i do nothing for those variables." [1] "find_and_transform_numerics: It took me 0s to identify 2 numerics column(s), i will set them as numerics" [1] "find_and_transform_numerics: It took me 0s to transform 2 column(s) to a numeric format." 
[1] "find_and_transform_numerics: It took me 0.01s to identify 0 numerics column(s), i will set them as numerics" [1] "find_and_transform_numerics: There are no numerics to transform.(If i missed something consider using set_col_as_numeric to transform it)" [1] "prepare_set: step one: correcting mistakes." [1] "fast_filter_variables: I check for constant columns." [1] "fast_filter_variables: I check for columns in double." [1] "fast_filter_variables: I check for columns that are bijections of another column." [1] "fast_filter_variables: I delete 1 column(s) that are bijections of another column in data_set." [1] "age" "fnlwgt" "capital_gain" "capital_loss" "hr_per_week" [1] "un_factor: c(\"age\", \"fnlwgt\", \"capital_gain\", \"capital_loss\", \"hr_per_week\") aren't columns of types factor i do nothing for those variables." [1] "un_factor: I will identify variable that are factor but shouldn't be." [1] "un_factor: I un-factor education." [1] "un_factor: I un-factor occupation." [1] "un_factor: I un-factor country." [1] "un_factor: It took me 0s to un-factor 3 column(s)." [1] "find_and_transform_numerics: It took me 0s to identify 0 numerics column(s), i will set them as numerics" [1] "find_and_transform_numerics: There are no numerics to transform.(If i missed something consider using set_col_as_numeric to transform it)" [1] "find_and_transform_dates: It took me 1.17s to identify formats" [1] "find_and_transform_dates: There are no dates to transform.\n (If i missed something please provide the date format in inputs or\n consider using set_col_as_date to transform it)." [1] "prepare_set: step two: transforming data_set." 
[1] "age" "type_employer" "fnlwgt" "education" [5] "marital" "occupation" "relationship" "race" [9] "sex" "capital_gain" "capital_loss" "hr_per_week" [13] "country" "income" [1] "prepare_set: c(\"age\", \"type_employer\", \"fnlwgt\", \"education\", \"marital\", \"occupation\", \"relationship\", \"race\", \"sex\", \"capital_gain\", \"capital_loss\", \"hr_per_week\", \"country\", \"income\") aren't columns of types date i do nothing for those variables." [1] "generate_date_diffs: I will generate difference between dates." [1] "generate_date_diffs: It took me 0s to create 0 column(s)." [1] "generate_factor_from_date: I will create a factor column from each date column." [1] "generate_factor_from_date: It took me 0s to transform 0 column(s)." [1] "age" "type_employer" "fnlwgt" "marital" [5] "relationship" "race" "sex" "capital_gain" [9] "capital_loss" "hr_per_week" "income" [1] "prepare_set: c(\"age\", \"type_employer\", \"fnlwgt\", \"marital\", \"relationship\", \"race\", \"sex\", \"capital_gain\", \"capital_loss\", \"hr_per_week\", \"income\") aren't columns of types character i do nothing for those variables." [1] "generate_from_character: it took me: 0.02s to transform 2 character columns into, 6 new columns." [1] "build_target_encoding: Start to compute encoding for target_encoding according to col: capital_gain." [1] "target_encode: Start to encode columns according to target." [1] "aggregate_by_key: I start to aggregate" [1] "aggregate_by_key: 63 columns have been constructed. It took 0.19 seconds. " [1] "prepare_set: step three: filtering data_set." [1] "fast_filter_variables: I check for constant columns." [1] "fast_filter_variables: I delete 2 constant column(s) in result." [1] "fast_filter_variables: I check for columns in double." [1] "fast_filter_variables: I check for columns that are bijections of another column." [1] "fast_filter_variables: I delete 6 column(s) that are bijections of another column in result." 
[1] "country" [1] "fast_round: country aren't columns of types numeric or integer i do nothing for those variables." [1] "prepare_set: step four: handling NA." [1] "prepare_set: step five: shaping result." [1] "set_col_as_factor: I will set some columns to factor." [1] "set_col_as_factor: I am doing the column country." [1] "set_col_as_factor: it took me: 0s to transform 1 column(s) to factor." [1] "shape_set: Transforming numerical variables into factors when length(unique(col)) <= 10." [1] "shape_set: Previous distribution of column types:" col_class_init character numeric 1 54 [1] "shape_set: Current distribution of column types:" col_class_end factor numeric 43 12 [1] "remove_sd_outlier: I start to filter categorical rare events" [1] "remove_sd_outlier: dropped 1 row(s) that are rare event on num_col." [1] "remove_sd_outlier: 1 have been dropped. It took 0 seconds. " [1] "remove_sd_outlier: I start to filter categorical rare events" [1] "remove_sd_outlier: dropped 0 row(s) that are rare event on num_col." [1] "remove_sd_outlier: 0 have been dropped. It took 0 seconds. " [1] "remove_rare_categorical: I start to filter categorical rare events" [1] "remove_rare_categorical: dropped 1 row(s) that are rare event on cat_col." [1] "remove_rare_categorical: 1 have been dropped. It took 0 seconds. " [1] "remove_percentile_outlier: I start to filter categorical rare events" [1] "remove_percentile_outlier: dropped 2 row(s) that are rare event on num_col." [1] "remove_percentile_outlier: 2 have been dropped. It took 0 seconds. " [1] "remove_percentile_outlier: I start to filter categorical rare events" [1] "remove_percentile_outlier: dropped 2 row(s) that are rare event on num_col." [1] "remove_percentile_outlier: 2 have been dropped. It took 0 seconds. " [1] "same_shape: verify that every column is present." [1] "same_shape: columns col_2 are missing, I create them." [1] "same_shape: drop unwanted columns." [1] "same_shape: verify that every column is in the right type." 
[1] "same_shape: col_2 class was logical i set it to numeric." [1] "same_shape: verify that every factor as the right number of levels." [1] "same_shape: verify that every column is present." [1] "same_shape: drop unwanted columns." [1] "same_shape: the following columns are in data_set but not in reference_set: I drop them: " [1] "col_2" [1] "same_shape: verify that every column is in the right type." [1] "same_shape: verify that every factor as the right number of levels." [1] "same_shape: verify that every column is present." [1] "same_shape: drop unwanted columns." [1] "same_shape: verify that every column is in the right type." [1] "same_shape: col_1 class was character i set it to numeric." [1] "same_shape: verify that every factor as the right number of levels." [1] "same_shape: verify that every column is present." [1] "same_shape: drop unwanted columns." [1] "same_shape: verify that every column is in the right type." [1] "same_shape: col_1 class was character i set it to c(\"POSIXct\", \"POSIXt\")." [1] "same_shape: verify that every factor as the right number of levels." [1] "same_shape: verify that every column is present." [1] "same_shape: drop unwanted columns." [1] "same_shape: verify that every column is in the right type." [1] "same_shape: verify that every factor as the right number of levels." [1] "same_shape: col_1 class had different levels than in reference_set I change it." [1] "same_shape: verify that every column is present." [1] "same_shape: drop unwanted columns." [1] "same_shape: verify that every column is in the right type." [1] "same_shape: verify that every factor as the right number of levels." [1] "same_shape: col_1 class had different levels than in reference_set I change it." [1] "same_shape: verify that every column is present." [1] "same_shape: drop unwanted columns." [1] "same_shape: verify that every column is in the right type." [1] "same_shape: verify that every factor as the right number of levels." 
[1] "same_shape: verify that every column is present." [1] "same_shape: drop unwanted columns." [1] "same_shape: verify that every column is in the right type." [1] "same_shape: col_1 class was numeric i set it to weird_class." [1] "same_shape: verify that every factor as the right number of levels." [1] "same_shape: verify that every column is present." [1] "same_shape: drop unwanted columns." [1] "same_shape: verify that every column is in the right type." [1] "same_shape: col_1 class was numeric i set it to weird_class." [1] "same_shape: verify that every factor as the right number of levels." [1] "same_shape: verify that every column is present." [1] "same_shape: columns type_employer?, type_employerFederal-gov, type_employerLocal-gov, type_employerNever-worked, type_employerPrivate, type_employerSelf-emp-inc, type_employerSelf-emp-not-inc, type_employerState-gov, type_employerWithout-pay, education11th, education12th, education1st-4th, education5th-6th, education7th-8th, education9th, educationAssoc-acdm, educationAssoc-voc, educationBachelors, educationDoctorate, educationHS-grad, educationMasters, educationPreschool, educationProf-school, educationSome-college, maritalMarried-AF-spouse, maritalMarried-civ-spouse, maritalMarried-spouse-absent, maritalNever-married, maritalSeparated, maritalWidowed, occupationAdm-clerical, occupationArmed-Forces, occupationCraft-repair, occupationExec-managerial, occupationFarming-fishing, occupationHandlers-cleaners, occupationMachine-op-inspct, occupationOther-service, occupationPriv-house-serv, occupationProf-specialty, occupationProtective-serv, occupationSales, occupationTech-support, occupationTransport-moving, relationshipNot-in-family, relationshipOther-relative, relationshipOwn-child, relationshipUnmarried, relationshipWife, raceAsian-Pac-Islander, raceBlack, raceOther, raceWhite, sexMale, capital_loss1408, capital_loss1564, capital_loss1573, capital_loss1719, capital_loss1762, capital_loss1887, capital_loss1902, 
capital_loss2042, capital_loss2179, countryCambodia, countryCanada, countryChina, countryColumbia, countryCuba, countryDominican-Republic, countryEcuador, countryEl-Salvador, countryEngland, countryFrance, countryGermany, countryGreece, countryGuatemala, countryHaiti, countryHoland-Netherlands, countryHonduras, countryHong, countryHungary, countryIndia, countryIran, countryIreland, countryItaly, countryJamaica, countryJapan, countryLaos, countryMexico, countryNicaragua, countryOutlying-US(Guam-USVI-etc), countryPeru, countryPhilippines, countryPoland, countryPortugal, countryPuerto-Rico, countryScotland, countrySouth, countryTaiwan, countryThailand, countryTrinadad&Tobago, countryUnited-States, countryVietnam, countryYugoslavia, income>50K are missing, I create them." [1] "same_shape: drop unwanted columns." [1] "same_shape: the following columns are in data_set but not in reference_set: I drop them: " [1] "type_employer" "education" "marital" "occupation" [5] "relationship" "race" "sex" "capital_loss" [9] "country" "income" [1] "same_shape: verify that every column is in the right type." [1] "same_shape: age class was integer i set it to numeric." [1] "same_shape: fnlwgt class was integer i set it to numeric." [1] "same_shape: education_num class was integer i set it to numeric." [1] "same_shape: capital_gain class was integer i set it to numeric." [1] "same_shape: hr_per_week class was integer i set it to numeric." [1] "same_shape: type_employer? class was logical i set it to numeric." [1] "same_shape: type_employerFederal-gov class was logical i set it to numeric." [1] "same_shape: type_employerLocal-gov class was logical i set it to numeric." [1] "same_shape: type_employerNever-worked class was logical i set it to numeric." [1] "same_shape: type_employerPrivate class was logical i set it to numeric." [1] "same_shape: type_employerSelf-emp-inc class was logical i set it to numeric." 
[1] "same_shape: type_employerSelf-emp-not-inc class was logical i set it to numeric." [1] "same_shape: type_employerState-gov class was logical i set it to numeric." [1] "same_shape: type_employerWithout-pay class was logical i set it to numeric." [1] "same_shape: education11th class was logical i set it to numeric." [1] "same_shape: education12th class was logical i set it to numeric." [1] "same_shape: education1st-4th class was logical i set it to numeric." [1] "same_shape: education5th-6th class was logical i set it to numeric." [1] "same_shape: education7th-8th class was logical i set it to numeric." [1] "same_shape: education9th class was logical i set it to numeric." [1] "same_shape: educationAssoc-acdm class was logical i set it to numeric." [1] "same_shape: educationAssoc-voc class was logical i set it to numeric." [1] "same_shape: educationBachelors class was logical i set it to numeric." [1] "same_shape: educationDoctorate class was logical i set it to numeric." [1] "same_shape: educationHS-grad class was logical i set it to numeric." [1] "same_shape: educationMasters class was logical i set it to numeric." [1] "same_shape: educationPreschool class was logical i set it to numeric." [1] "same_shape: educationProf-school class was logical i set it to numeric." [1] "same_shape: educationSome-college class was logical i set it to numeric." [1] "same_shape: maritalMarried-AF-spouse class was logical i set it to numeric." [1] "same_shape: maritalMarried-civ-spouse class was logical i set it to numeric." [1] "same_shape: maritalMarried-spouse-absent class was logical i set it to numeric." [1] "same_shape: maritalNever-married class was logical i set it to numeric." [1] "same_shape: maritalSeparated class was logical i set it to numeric." [1] "same_shape: maritalWidowed class was logical i set it to numeric." [1] "same_shape: occupationAdm-clerical class was logical i set it to numeric." 
[1] "same_shape: occupationArmed-Forces class was logical i set it to numeric." [1] "same_shape: occupationCraft-repair class was logical i set it to numeric." [1] "same_shape: occupationExec-managerial class was logical i set it to numeric." [1] "same_shape: occupationFarming-fishing class was logical i set it to numeric." [1] "same_shape: occupationHandlers-cleaners class was logical i set it to numeric." [1] "same_shape: occupationMachine-op-inspct class was logical i set it to numeric." [1] "same_shape: occupationOther-service class was logical i set it to numeric." [1] "same_shape: occupationPriv-house-serv class was logical i set it to numeric." [1] "same_shape: occupationProf-specialty class was logical i set it to numeric." [1] "same_shape: occupationProtective-serv class was logical i set it to numeric." [1] "same_shape: occupationSales class was logical i set it to numeric." [1] "same_shape: occupationTech-support class was logical i set it to numeric." [1] "same_shape: occupationTransport-moving class was logical i set it to numeric." [1] "same_shape: relationshipNot-in-family class was logical i set it to numeric." [1] "same_shape: relationshipOther-relative class was logical i set it to numeric." [1] "same_shape: relationshipOwn-child class was logical i set it to numeric." [1] "same_shape: relationshipUnmarried class was logical i set it to numeric." [1] "same_shape: relationshipWife class was logical i set it to numeric." [1] "same_shape: raceAsian-Pac-Islander class was logical i set it to numeric." [1] "same_shape: raceBlack class was logical i set it to numeric." [1] "same_shape: raceOther class was logical i set it to numeric." [1] "same_shape: raceWhite class was logical i set it to numeric." [1] "same_shape: sexMale class was logical i set it to numeric." [1] "same_shape: capital_loss1408 class was logical i set it to numeric." [1] "same_shape: capital_loss1564 class was logical i set it to numeric." 
[1] "same_shape: capital_loss1573 class was logical i set it to numeric." [1] "same_shape: capital_loss1719 class was logical i set it to numeric." [1] "same_shape: capital_loss1762 class was logical i set it to numeric." [1] "same_shape: capital_loss1887 class was logical i set it to numeric." [1] "same_shape: capital_loss1902 class was logical i set it to numeric." [1] "same_shape: capital_loss2042 class was logical i set it to numeric." [1] "same_shape: capital_loss2179 class was logical i set it to numeric." [1] "same_shape: countryCambodia class was logical i set it to numeric." [1] "same_shape: countryCanada class was logical i set it to numeric." [1] "same_shape: countryChina class was logical i set it to numeric." [1] "same_shape: countryColumbia class was logical i set it to numeric." [1] "same_shape: countryCuba class was logical i set it to numeric." [1] "same_shape: countryDominican-Republic class was logical i set it to numeric." [1] "same_shape: countryEcuador class was logical i set it to numeric." [1] "same_shape: countryEl-Salvador class was logical i set it to numeric." [1] "same_shape: countryEngland class was logical i set it to numeric." [1] "same_shape: countryFrance class was logical i set it to numeric." [1] "same_shape: countryGermany class was logical i set it to numeric." [1] "same_shape: countryGreece class was logical i set it to numeric." [1] "same_shape: countryGuatemala class was logical i set it to numeric." [1] "same_shape: countryHaiti class was logical i set it to numeric." [1] "same_shape: countryHoland-Netherlands class was logical i set it to numeric." [1] "same_shape: countryHonduras class was logical i set it to numeric." [1] "same_shape: countryHong class was logical i set it to numeric." [1] "same_shape: countryHungary class was logical i set it to numeric." [1] "same_shape: countryIndia class was logical i set it to numeric." [1] "same_shape: countryIran class was logical i set it to numeric." 
[1] "same_shape: countryIreland class was logical i set it to numeric." [1] "same_shape: countryItaly class was logical i set it to numeric." [1] "same_shape: countryJamaica class was logical i set it to numeric." [1] "same_shape: countryJapan class was logical i set it to numeric." [1] "same_shape: countryLaos class was logical i set it to numeric." [1] "same_shape: countryMexico class was logical i set it to numeric." [1] "same_shape: countryNicaragua class was logical i set it to numeric." [1] "same_shape: countryOutlying-US(Guam-USVI-etc) class was logical i set it to numeric." [1] "same_shape: countryPeru class was logical i set it to numeric." [1] "same_shape: countryPhilippines class was logical i set it to numeric." [1] "same_shape: countryPoland class was logical i set it to numeric." [1] "same_shape: countryPortugal class was logical i set it to numeric." [1] "same_shape: countryPuerto-Rico class was logical i set it to numeric." [1] "same_shape: countryScotland class was logical i set it to numeric." [1] "same_shape: countrySouth class was logical i set it to numeric." [1] "same_shape: countryTaiwan class was logical i set it to numeric." [1] "same_shape: countryThailand class was logical i set it to numeric." [1] "same_shape: countryTrinadad&Tobago class was logical i set it to numeric." [1] "same_shape: countryUnited-States class was logical i set it to numeric." [1] "same_shape: countryVietnam class was logical i set it to numeric." [1] "same_shape: countryYugoslavia class was logical i set it to numeric." [1] "same_shape: income>50K class was logical i set it to numeric." [1] "same_shape: verify that every factor as the right number of levels." [1] "same_shape: verify that every column is present." [1] "same_shape: drop unwanted columns." [1] "same_shape: verify that every column is in the right type." [1] "same_shape: verify that every factor as the right number of levels." [1] "build_scales: I will compute scale on 1 numeric columns." 
[1] "build_scales: it took me: 0s to compute scale for 1 numeric columns." [1] "build_scales: I will compute scale on 1 numeric columns." [1] "build_scales: it took me: 0s to compute scale for 1 numeric columns." [1] "fast_scale: I will scale 1 numeric columns." [1] "fast_scale: it took me: 0s to scale 1 numeric columns." [1] "build_scales: I will compute scale on 1 numeric columns." [1] "build_scales: it took me: 0s to compute scale for 1 numeric columns." [1] "fast_scale: I will scale 1 numeric columns." [1] "fast_scale: it took me: 0s to scale 1 numeric columns." [1] "fast_scale: I will scale 1 numeric columns." [1] "fast_scale: it took me: 0s to unscale 1 numeric columns." [1] "build_scales: I will compute scale on 1 numeric columns." [1] "build_scales: it took me: 0s to compute scale for 1 numeric columns." [1] "set_col_as_numeric: I will set some columns as numeric" [1] "set_col_as_numeric: I am doing the column char_col_1." [1] "set_col_as_numeric: 0 NA have been created due to transformation to numeric." [1] "set_col_as_numeric: I am doing the column char_col_2." [1] "set_col_as_numeric: 0 NA have been created due to transformation to numeric." [1] "set_col_as_character: I will set some columns as character" [1] "set_col_as_character: I am doing the column numCol." [1] "set_col_as_character: I am doing the column factorCol." [1] "set_col_as_character: I am doing the column charcol." [1] "set_col_as_character: charcol is a character, i do nothing." [1] "set_col_as_date: I will set some columns as Date." [1] "set_col_as_date: I am doing the column date1." [1] "set_col_as_date:1 NA have been created due to transformation to Date." [1] "set_col_as_date: I am doing the column date2." [1] "set_col_as_date:1 NA have been created due to transformation to Date." [1] "set_col_as_date: it took me: 0s to transform 2 column(s) to Dates." [1] "set_col_as_date: I will set some columns as Date." [1] "set_col_as_date: I am doing the column date2." 
[1] "set_col_as_date:1 NA have been created due to transformation to Date." [1] "set_col_as_date: it took me: 0s to transform 1 column(s) to Dates." [1] "set_col_as_date: I will set some columns as Date." [1] "set_col_as_date: I am doing the column date1." [1] "set_col_as_date:1 NA have been created due to transformation to Date." [1] "set_col_as_date: it took me: 0.02s to transform 1 column(s) to Dates." [1] "set_col_as_date: I will set some columns as Date." [1] "set_col_as_date: I am doing the column ID." [1] "set_col_as_date: it took me: 0s to transform 0 column(s) to Dates." [1] "set_col_as_date: I will set some columns as Date." [1] "set_col_as_date: I am doing the column ID." [1] "set_col_as_date: Since i generated only NAs i set ID as it was before." [1] "set_col_as_date: it took me: 0.02s to transform 1 column(s) to Dates." [1] "set_col_as_date: I will set some columns as Date." [1] "set_col_as_date: I am doing the column ID." [1] "set_col_as_date: ID doesn't seem to be a date, if it really is please provide format." [1] "set_col_as_date: it took me: 0s to transform 1 column(s) to Dates." [1] "set_col_as_date: I will set some columns as Date." [1] "set_col_as_date: I am doing the column time." [1] "set_col_as_date: it took me: 0s to transform 1 column(s) to Dates." [1] "set_col_as_date: I will set some columns as Date." [1] "set_col_as_date: I am doing the column time_stamp_s." [1] "set_col_as_date: it took me: 0s to transform 1 column(s) to Dates." [1] "set_col_as_date: I will set some columns as Date." [1] "set_col_as_date: I am doing the column time_stamp_ms." [1] "set_col_as_date: it took me: 0s to transform 1 column(s) to Dates." [1] "set_col_as_factor: I will set some columns to factor." [1] "set_col_as_factor: I am doing the column col." [1] "set_col_as_factor: it took me: 0s to transform 1 column(s) to factor." [1] "set_col_as_factor: I will set some columns to factor." [1] "set_col_as_factor: I am doing the column col." 
[1] "set_col_as_factor: it took me: 0s to transform 1 column(s) to factor." [1] "set_col_as_factor: I will set some columns to factor." [1] "set_col_as_factor: I am doing the column col." [1] "set_col_as_factor: col has more than 2 values, i don't transform it." [1] "set_col_as_factor: it took me: 0s to transform 0 column(s) to factor." [1] "set_col_as_factor: I will set some columns to factor." [1] "set_col_as_factor: it took me: 0s to transform 0 column(s) to factor." [1] "shape_set: Transforming numerical variables into factors when length(unique(col)) <= 10." [1] "shape_set: Previous distribution of column types:" col_class_init factor integer 9 6 [1] "shape_set: Current distribution of column types:" col_class_end factor integer 9 6 [1] "set_col_as_factor: I will set some columns to factor." [1] "set_col_as_factor: it took me: 0s to transform 0 column(s) to factor." [1] "shape_set: Transforming numerical variables into factors when length(unique(col)) <= 10." [1] "shape_set: Previous distribution of column types:" col_class_init factor integer 9 6 [1] "shape_set: Current distribution of column types:" col_class_end factor integer 9 6 [1] "set_col_as_factor: I will set some columns to factor." [1] "set_col_as_factor: it took me: 0s to transform 0 column(s) to factor." [1] "shape_set: Transforming numerical variables into factors when length(unique(col)) <= 10." [1] "shape_set: Previous distribution of column types:" col_class_init factor integer 9 6 [1] "shape_set: Current distribution of column types:" col_class_end factor integer 9 6 [1] "set_col_as_factor: I will set some columns to factor." [1] "set_col_as_factor: it took me: 0s to transform 0 column(s) to factor." [1] "shape_set: Transforming logical into binaries.\n" [1] "shape_set: Previous distribution of column types:" col_class_init logical 1 [1] "shape_set: Current distribution of column types:" col_class_end integer 1 [1] "which_are_constant: constantCol is constant." 
[1] "which_are_constant: it took me 0s to identify 1 constant column(s)"
[1] "which_are_in_double: it took me 0s to identify 2 column(s) to drop."
[1] "which_are_in_double: it took me 0s to identify 1 column(s) to drop."
[1] "which_are_in_double: it took me 0s to identify 1 column(s) to drop."
[1] "which_are_in_double: it took me 0s to identify 0 column(s) to drop."
[1] "which_are_bijection: it took me 0.01s to identify 1 column(s) to drop."
[1] "which_are_bijection: education is a bijection of education_num. I put it in drop list."
[1] "which_are_bijection: it took me 0.02s to identify 1 column(s) to drop."
[1] "which_are_bijection: it took me 0s to identify 0 column(s) to drop."
[1] "which_are_included: education is included in column education_num."
[1] "which_are_included: education_num is included in column education."
[1] "which_are_included: are_50_or_more is included in column age."
[1] "which_are_included: constant is included in column sex."
[1] "which_are_included: sex is included in column fnlwgt."
[1] "which_are_included: income is included in column id."
[1] "which_are_included: race is included in column fnlwgt."
[1] "which_are_included: relationship is included in column id."
[1] "which_are_included: type_employer is included in column fnlwgt."
[1] "which_are_included: marital is included in column id."
[1] "which_are_included: occupation is included in column id."
[1] "which_are_included: education is included in column education_num."
[1] "which_are_included: education_num is included in column id."
[1] "which_are_included: capital_gain is included in column fnlwgt."
[1] "which_are_included: capital_loss is included in column fnlwgt."
[1] "which_are_included: country is included in column fnlwgt."
[1] "which_are_included: hr_per_week is included in column id."
[1] "which_are_included: age is included in column id."
[1] "which_are_included: mail is included in column id."
[1] "which_are_included: date2 is included in column id."
[1] "which_are_included: date1 is included in column id."
[1] "which_are_included: date3 is included in column date4."
[1] "which_are_included: date4 is included in column id."
[1] "which_are_included: num1 is included in column num3."
[1] "which_are_included: num3 is included in column id."
[1] "which_are_included: num2 is included in column id."
[1] "which_are_included: fnlwgt is included in column id."
[1] "which_are_included: constant is included in column sex."
[1] "which_are_included: sex is included in column fnlwgt."
[1] "which_are_included: income is included in column id."
[1] "which_are_included: race is included in column fnlwgt."
[1] "which_are_included: relationship is included in column id."
[1] "which_are_included: type_employer is included in column fnlwgt."
[1] "which_are_included: marital is included in column id."
[1] "which_are_included: occupation is included in column id."
[1] "which_are_included: education is included in column education_num."
[1] "which_are_included: education_num is included in column id."
[1] "which_are_included: capital_gain is included in column fnlwgt."
[1] "which_are_included: capital_loss is included in column fnlwgt."
[1] "which_are_included: country is included in column fnlwgt."
[1] "which_are_included: hr_per_week is included in column id."
[1] "which_are_included: age is included in column id."
[1] "which_are_included: mail is included in column id."
[1] "which_are_included: date2 is included in column id."
[1] "which_are_included: date1 is included in column id."
[1] "which_are_included: date3 is included in column date4."
[1] "which_are_included: date4 is included in column id."
[1] "which_are_included: num1 is included in column num3."
[1] "which_are_included: num3 is included in column id."
[1] "which_are_included: num2 is included in column id."
[1] "which_are_included: fnlwgt is included in column id."
[ FAIL 0 | WARN 0 | SKIP 0 | PASS 331 ]
>
> proc.time()
   user  system elapsed
  12.51    1.39   14.39