카테고리 없음

스파르타 AI-8기 TIL(9/26)

kimjunki-8 2024. 9. 26. 16:11

What if there is specific data in the column(Null or Not Given)

1. Just exclude data.

select restaurant_name,

avg(rating) avg_rating,

avg(if(ratiing<>'Not given', rating, null))

from food_orders fo

if(rating<>'Not given', rating, null -> if there is 'Not Given' data on the column, change it to null(means there is no data.), or just show the data itself(name of table = rating)

 

or

 

can use "is not null"

select a.order_id,

a.customer_id,

a.restaurant_name,

a.price,

b.name,

b.age,

b.gender

from food_orders a left join customers b on a.customer_id=b.customer_id

where b.customer_id is not null

is not null -> means to call all the data that doesn't contain null from b.customer.id

 

or

 

can replace the data with another data

coalesce(data table, replacement data)

select a.order_id,

a.customer_id,

a.restaurant_name,

a.price,

b.name,

b.age,

coalesce(b.age, 20) "null 제거",

b.gender

from food_orders a left join customers b on a.customer_id=b.customer_id

where b.age is null

 

coalesce -> means if there is a null data, it will replace to 20

 

 

2. What if there is an abnormal data?

1. set the exact data range by IF statement

As I see, there is abnormal data in there(like age 6, or more than 80) so, I can change it with IF

Here, If there is an age over 80, it will be changed to 80 or lower than 15, it will be changed to 15

 

Pivot table 

-> Aggregating data by two or more criteria means displaying it in an easily viewable arrangement.

Example.

select restaurant_name,

max(if(hh='15', cnt_order, 0)) "15",

max(if(hh='16', cnt_order, 0)) "16",

max(if(hh='17', cnt_order, 0)) "17",

max(if(hh='18', cnt_order, 0)) "18",

max(if(hh='19', cnt_order, 0)) "19",

max(if(hh='20', cnt_order, 0)) "20"

from

(

select a.restaurant_name,

substring(b.time, 1, 2) hh,

count(1) cnt_order

from food_orders a inner join payments b on a.order_id=b.order_id

where substring(b.time, 1, 2) between 15 and 20

group by 1, 2

) a

group by 1

order by 7 desc

 

From here,

max(if(hh='15', cnt_order, 0)) "15",

max(if(hh='16', cnt_order, 0)) "16",

max(if(hh='17', cnt_order, 0)) "17",

max(if(hh='18', cnt_order, 0)) "18",

max(if(hh='19', cnt_order, 0)) "19",

max(if(hh='20', cnt_order, 0)) "20"

by max-> just consider it as one of the frames to make a column. 

 

Window function

frame->

1. window_function(argument) over (partition by (group standard column) order by (sort by)

1.1 Rank -> the name itself. it will rank certain columns

 

How to use rank?

select cuisine_type,

restaurant_name,

cnt_order,

rank() over (PARTITION by cuisine_type order by cnt_order desc) ranking

from

(

select cuisine_type,

restaurant_name,

count(1) cnt_order #based on the first statement(cuisine_type) it will count all the columns

from food_orders fo

group by 1, 2

) a

rank() over (partition by cuisine_type order by cnt_order descranking

1. rank() over -> these are pair 

2. after that I must write 2 things 2.1 partition by 2.2 order by

2.1 partition by -> It will separate the data based on the name of the table

2.2 order by -> after partition by, must specify how to order the table

Based on this picture, after the American column, It will restart ranking on Chinese columns.

Because these data have been partitioned by partition by

 

Date function(date)

Turn the column into a date-type

select date,

           date(date) -> remember date is a function

from payments

format -> format the data type

select date(date) date_type,
       date_format(date(date), '%Y') "년", -> format the date(date) data to %Y form(Y -> year)
       date_format(date(date), '%m') "월", -> (m -> month)
       date_format(date(date), '%d') "일", -> (d -> day)
       date_format(date(date), '%w') "요일" (w -> weekend)

from payments

remember!

from the data,

0 -> Sunday

1 -> Monday
2 -> Tuesday
3 -> Wednesday

4 -> Thursday

5 -> Friday

6 -> Saturday

 

H.W

1. Create a pivot view of the number of orders by food type and age (age is between 10 and 59 years old)

select cuisine_type,

max(if(age = 10, cnt_orders, 0)) '10대',

max(if(age = 20, cnt_orders, 0)) '20대',

max(if(age = 30, cnt_orders, 0)) '30대',

max(if(age = 40, cnt_orders, 0)) '40대',

max(if(age = 50, cnt_orders, 0)) '50대'

from

(

select cuisine_type,

case when age between 10 and 19 then 10

when age between 20 and 29 then 20

when age between 30 and 39 then 30

when age between 40 and 49 then 40

else '50대' end age,

count(1) cnt_orders

from food_orders f inner join customers c on f.customer_id = c.customer_id

group by 1, 2

) a

group by 1

Done. Good night.