The StarThinker project will no longer receive new solution contributions from the Google team.
Please read the full StarThinker Open Source Support Ends At Google article for more details.




Census Data Correlation

Correlate another table with US Census data. Expands a data set's dimensions by finding population segments that correlate with the master table.






Instructions

Census Normalize is a prerequisite; run it at least once first.
Specify the JOIN, PASS, SUM, and CORRELATE columns to build the correlation query.
Define the DATASET and TABLE for the joinable source. The source can be a view.
Choose the significance level. A higher significance level usually means more NULL results; use this value to balance quantity and quality.
Specify where to write the results.
IMPORTANT: If you use VIEWS, you will have to delete them manually if the recipe changes.
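
To illustrate why a stricter significance level tends to produce more NULL results, here is a minimal Python sketch. It is not the recipe's actual query: it assumes a Pearson correlation tested with a t-statistic, and the function names, sample data, and caller-supplied critical values are all hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation of two equal-length sequences, or None if undefined."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length with at least 3 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a constant column has no defined correlation
    return cov / math.sqrt(var_x * var_y)


def significant_r(xs, ys, critical_t):
    """Return r if its t-statistic reaches critical_t, else None (the NULL case)."""
    r = pearson_r(xs, ys)
    if r is None:
        return None
    remainder = 1.0 - r * r
    if remainder <= 0:
        return r  # perfect correlation: the t-statistic is unbounded
    t = abs(r) * math.sqrt((len(xs) - 2) / remainder)
    return r if t >= critical_t else None


# Hypothetical data: one percentage column against one census segment.
master = [1, 2, 3, 4, 5]
segment = [2, 4, 5, 4, 5]

lenient = significant_r(master, segment, critical_t=1.5)  # low bar: correlation kept
strict = significant_r(master, segment, critical_t=5.0)   # high bar: result is None
```

The same correlation passes the lenient threshold and is dropped by the strict one, which is the quantity-versus-quality trade-off the significance setting controls.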

Details

Open Source YES
Age July 6, 2020 (2 years, 6 months)
Authors kenjora@google.com
Schedule Days Configured by user.
Schedule Hours Configured by user.
[
    {
        "census": {
            "auth": {
                "field": {
                    "name": "auth",
                    "kind": "authentication",
                    "order": 0,
                    "default": "service",
                    "description": "Credentials used for writing data."
                }
            },
            "correlate": {
                "join": {
                    "field": {
                        "name": "join",
                        "kind": "string",
                        "order": 1,
                        "default": "",
                        "description": "Name of column to join on, must match Census Geo_Id column."
                    }
                },
                "pass": {
                    "field": {
                        "name": "pass",
                        "kind": "string_list",
                        "order": 2,
                        "default": [],
                        "description": "Comma separated list of columns to pass through."
                    }
                },
                "sum": {
                    "field": {
                        "name": "sum",
                        "kind": "string_list",
                        "order": 3,
                        "default": [],
                        "description": "Comma separated list of columns to sum, optional."
                    }
                },
                "correlate": {
                    "field": {
                        "name": "correlate",
                        "kind": "string_list",
                        "order": 4,
                        "default": [],
                        "description": "Comma separated list of percentage columns to correlate."
                    }
                },
                "dataset": {
                    "field": {
                        "name": "from_dataset",
                        "kind": "string",
                        "order": 5,
                        "default": "",
                        "description": "Existing BigQuery dataset."
                    }
                },
                "table": {
                    "field": {
                        "name": "from_table",
                        "kind": "string",
                        "order": 6,
                        "default": "",
                        "description": "Table to use as join data."
                    }
                },
                "significance": {
                    "field": {
                        "name": "significance",
                        "kind": "choice",
                        "order": 7,
                        "default": "80",
                        "description": "Select level of significance to test.",
                        "choices": [
                            "80",
                            "90",
                            "98",
                            "99",
                            "99.5",
                            "99.95"
                        ]
                    }
                }
            },
            "to": {
                "dataset": {
                    "field": {
                        "name": "to_dataset",
                        "kind": "string",
                        "order": 9,
                        "default": "",
                        "description": "Existing BigQuery dataset."
                    }
                },
                "type": {
                    "field": {
                        "name": "type",
                        "kind": "choice",
                        "order": 10,
                        "default": "table",
                        "description": "Write Census_Percent as table or view.",
                        "choices": [
                            "table",
                            "view"
                        ]
                    }
                }
            }
        }
    }
]
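
Each `"field"` object in the recipe above is a placeholder that a user-supplied value, or the field's `default`, replaces before the task runs. The Python sketch below shows one way that substitution could work. It is an illustration, not StarThinker's own implementation: `resolve_fields`, the trimmed recipe fragment, and the sample values `"Zip"` and `"my_dataset"` are hypothetical.

```python
def resolve_fields(node, values):
    """Recursively replace {"field": {...}} placeholders with concrete values.

    A placeholder is a dict whose only key is "field". Its value is looked up
    by the field's "name" in `values`, falling back to the field's "default".
    """
    if isinstance(node, dict):
        field = node.get("field")
        if set(node) == {"field"} and isinstance(field, dict):
            return values.get(field["name"], field.get("default"))
        return {key: resolve_fields(child, values) for key, child in node.items()}
    if isinstance(node, list):
        return [resolve_fields(item, values) for item in node]
    return node  # plain strings, numbers, etc. pass through unchanged


# Trimmed fragment of the recipe, keeping only a few of its fields.
recipe = [
    {
        "census": {
            "auth": {"field": {"name": "auth", "kind": "authentication", "default": "service"}},
            "correlate": {
                "join": {"field": {"name": "join", "kind": "string", "default": ""}},
                "dataset": {"field": {"name": "from_dataset", "kind": "string", "default": ""}},
                "significance": {"field": {"name": "significance", "kind": "choice", "default": "80"}},
            },
        }
    }
]

# Hypothetical user input; fields not listed here fall back to their defaults.
resolved = resolve_fields(recipe, {"join": "Zip", "from_dataset": "my_dataset"})
```

Note that values are keyed by the field's `name` (for example `from_dataset`), not by its position in the recipe (`correlate.dataset`), so the two dataset fields stay distinct.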

