Data Science 100 Knocks (Structured Data Processing) – SQL part4 (Q61 to Q80)

Commentary : 

This SQL code draws a random sample of about 1% of the rows in the customer table and displays at most 10 of them. The RANDOM() function generates a new random value between 0 and 1 for each row, and the WHERE clause keeps a row only when that value is less than or equal to 0.01, so each row has a 1% chance of being selected.

This means that each row has the same chance of passing the filter, regardless of any other criteria or order in the table. The LIMIT clause then restricts the output to the first 10 rows that meet the condition, so the rows displayed are the first 10 sampled rows the scan encounters rather than a uniform draw of 10 from the whole table. If fewer than 10 rows pass the filter, fewer rows are returned.
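
The filter-then-limit pattern described above can be tried out with a minimal sketch like the one below. It is an illustration under stated assumptions, not the original notebook cell: it uses Python's built-in sqlite3 module with a made-up customer table of 5,000 rows, and because SQLite's RANDOM() returns a 64-bit integer rather than a value between 0 and 1, it registers a hypothetical helper called rand01 that plays the role RANDOM() has in the commentary.

```python
import random
import sqlite3

# Hypothetical stand-in for the customer table: 5,000 rows with made-up IDs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO customer VALUES (?)",
    [(f"CS{i:05d}",) for i in range(5000)],
)

# Assumption: rand01 is a helper defined here, not a built-in SQL function.
# It returns a float in [0, 1) and is evaluated once per row.
rng = random.Random(0)
conn.create_function("rand01", 0, rng.random)

# Keep each row with probability 1%, then show at most the first 10 kept rows.
sample = conn.execute(
    "SELECT customer_id FROM customer WHERE rand01() <= 0.01 LIMIT 10"
).fetchall()
print(len(sample), "rows shown")
```

Because LIMIT stops the scan as soon as 10 rows have passed the filter, the rows shown tend to come from the beginning of the table. Ordering the whole table by a random value before applying LIMIT would be one way to avoid that, at the cost of sorting every row.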
 
Commentary :

This SQL code takes a random 10% sample of customers within each gender category of the "customer" table (stratified sampling) and then counts the sampled customers in each category.

Here is a breakdown of what each part of the code is doing:

WITH clause: This clause defines two common table expressions (CTEs), "customer_random" and "customer_rownum", which act as temporary named result sets for the subsequent SELECT statement.

customer_random: This CTE randomizes the customers within each gender group, exposing the randomized result as "customer_r". The "cnt" column holds the total number of customers in each gender group.

customer_rownum: This CTE adds a row number ("rn") to each customer in "customer_random", restarting the numbering for each gender.

SELECT statement: This statement counts, for each gender group, the customers whose row number is less than or equal to 10% of the total number of customers in that group. Because the customers were put in random order first, these rows form a random 10% sample of each group. The result is a table with two columns, "gender_cd" and "customer_num", where "gender_cd" is the gender code and "customer_num" is the number of sampled customers in that gender group.

Overall, this code performs stratified random sampling: it samples the same fraction from every gender group, so the sample keeps the gender proportions of the full dataset, and the per-gender counts make it possible to check that.
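
The steps above can be sketched as follows. This is a simplified illustration rather than the original query: it runs on Python's built-in sqlite3 module (window functions need SQLite 3.25 or later), the customer table and its gender codes "0", "1" and "9" are made up for the example, and the random ordering is done with an assumed sort key named r instead of the original "customer_r" value.

```python
import sqlite3

# Hypothetical customer table: 600, 300 and 100 customers per gender code.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id TEXT PRIMARY KEY, gender_cd TEXT)")
genders = ["0"] * 600 + ["1"] * 300 + ["9"] * 100
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(f"CS{i:04d}", g) for i, g in enumerate(genders)],
)

query = """
WITH customer_random AS (
    -- Attach a random sort key and the size of the gender group to each row.
    SELECT customer_id,
           gender_cd,
           RANDOM() AS r,
           COUNT(*) OVER (PARTITION BY gender_cd) AS cnt
    FROM customer
),
customer_rownum AS (
    -- Number the rows of each gender group in random order.
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY gender_cd ORDER BY r) AS rn
    FROM customer_random
)
-- Keep the first 10% of each group and count what was kept.
SELECT gender_cd, COUNT(*) AS customer_num
FROM customer_rownum
WHERE rn <= cnt * 0.1
GROUP BY gender_cd
ORDER BY gender_cd
"""
result = conn.execute(query).fetchall()
print(result)  # [('0', 60), ('1', 30), ('9', 10)]
```

Which customers end up in the sample changes from run to run, but the counts do not: each group always contributes 10% of its rows, which is what makes the sample stratified.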
 
Commentary :

This SQL code creates a new table called "product_1" containing only the rows of the "product" table in which both the "unit_price" and "unit_cost" columns are not NULL.

The first line of the code, "DROP TABLE IF EXISTS product_1;", drops the "product_1" table if it already exists, so that the CREATE TABLE statement that follows does not fail on a name conflict and the code can be rerun safely.

The second line, "CREATE TABLE product_1 AS (...);", creates the new table "product_1" and fills it with the results of a SELECT statement.

The SELECT statement selects all columns from the "product" table, but includes a WHERE clause that filters out any rows where either the "unit_price" or "unit_cost" column is NULL.

Overall, this code creates a new table that contains only the rows from the original "product" table that have values for both the "unit_price" and "unit_cost" columns. This is a common way to clean data before analysis or modeling, because missing values can cause problems for some analytical techniques.
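
The drop-and-recreate pattern described above can be sketched as follows. This is a minimal illustration on Python's built-in sqlite3 module, not the original notebook cell: the product rows and the product_cd column are made up for the example, and the SELECT is written without surrounding parentheses, which SQLite does not accept after AS.

```python
import sqlite3

# Hypothetical product table with some missing prices and costs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_cd TEXT, unit_price INTEGER, unit_cost INTEGER)"
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [
        ("P001", 198, 149),
        ("P002", None, 120),   # missing unit_price
        ("P003", 250, None),   # missing unit_cost
        ("P004", None, None),  # both missing
        ("P005", 88, 66),
    ],
)

# Drop any earlier copy, then rebuild product_1 from the complete rows only.
conn.executescript("""
DROP TABLE IF EXISTS product_1;
CREATE TABLE product_1 AS
SELECT *
FROM product
WHERE unit_price IS NOT NULL
  AND unit_cost IS NOT NULL;
""")

kept = conn.execute(
    "SELECT product_cd FROM product_1 ORDER BY product_cd"
).fetchall()
print(kept)  # [('P001',), ('P005',)]
```

Running the script a second time gives the same result, because the DROP TABLE IF EXISTS statement removes the previous copy of product_1 before it is created again.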
 
Data Science 100 Knocks (Structured Data Processing) - SQL
This is an ipynb file originally created by The Data Scientist Society (データサイエンティスト協会スキル定義委員) and translated from Japanese to English by DeepL. I updated this file to spread this practice, which is useful for everyone who wants to practice SQL, from beginners to advanced engineers. Because the data was created for Japanese users, you may run into some language issues when practicing, but they should not affect the exercises much.