์ƒ์„ธ ์ปจํ…์ธ 

๋ณธ๋ฌธ ์ œ๋ชฉ

[TIL] AWS Athena์—์„œ Json ๊ฐ’ ์ถ”์ถœ 220927

CLOUD/AWS

by ์—์Šคํ”„๋ฆฌํ„ฐ 2022. 9. 27. 23:05

๋ณธ๋ฌธ

728x90
๐Ÿ’ก Today I Learned ์š”์•ฝ (18ํšŒ์ฐจ)
- AWS Athena์— json ํ˜•ํƒœ๋กœ ์ €์žฅ๋œ ๊ฐ’์—์„œ ๋ฐ์ดํ„ฐ๋ฅผ ์ถ”์ถœํ•  ์ˆ˜ ์žˆ๋‹ค
- ์‚ฌ์šฉํ•จ์ˆ˜๋Š” json_extract ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•˜๋ฉด ๊ฐ€๋Šฅ

 

AWS Athena์—์„œ Json ๊ฐ’ ์ถ”์ถœ

DB์— Json ํ˜•ํƒœ๋กœ ๊ฐ’์ด ์ €์žฅ๋˜์–ด ์žˆ์„ ๊ฒฝ์šฐ, Json์„ parsing ํ•ด์„œ ์จ์•ผ ํ•˜๋Š”๋ฐ AWS Athena์˜ ๊ฒฝ์šฐ json_extract ํ•จ์ˆ˜๋ฅผ ํ†ตํ•ด์„œ key ๊ธฐ์ค€ ๊ฐ’์„ ์ถ”์ถœํ•  ์ˆ˜ ์žˆ๋„๋ก ํ•˜๊ณ  ์žˆ๋‹ค.

WITH dataset AS (
  SELECT '{"name": "Smith",
           "org": "Tech",
           "projects": [{"name":"project1", "completed":false},
           {"name":"project2", "completed":true}]}'
    AS blob
)
SELECT
  json_extract(blob, '$.name') AS name,
  json_extract(blob, '$.projects') AS projects
FROM dataset

AWS ์˜ˆ์ œ๋ฅผ ๊ทธ๋Œ€๋กœ ๊ฐ€์ง€๊ณ  ์™€๋ดค๋Š”๋ฐ $.key ๊ฐ’ ํ˜•ํƒœ๋กœ ์ถ”์ถœ์„ ํ•  ์ˆ˜ ์žˆ๋‹ค.

์ผ๋ฐ˜ DB ํ˜•ํƒœ์˜ ๊ฒฝ์šฐ ์•„๋ž˜์ฒ˜๋Ÿผ ์ฟผ๋ฆฌ๋ฅผ ์ž‘์„ฑํ•ด๋ณผ ์ˆ˜ ์žˆ๋‹ค. event_json ์นผ๋Ÿผ์˜ name ๊ฐ’๊ณผ content ๊ฐ’์„ ์ถ”์ถœํ•˜๋Š” ์˜ˆ์‹œ์ด๋‹ค

select
date_value
, json_extract(event_json, '$.name')
, json_extract_scalar(event_json, '$.content')
from database.json_inserted
where createdat_month >= '2022-08'
limit 1000
;

 


#aws ๊ด€๋ จ ์ž‘์„ฑ ๋ฐ ๋ฆฌ์„œ์น˜ ๋‚ด์—ญ:

ํƒœ๊ทธ

๊ด€๋ จ๊ธ€ ๋”๋ณด๊ธฐ

๋Œ“๊ธ€ ์˜์—ญ