Overview

Dataset statistics

Number of variables: 5
Number of observations: 85
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.7 KiB
Average record size in memory: 44.6 B

Variable types

Numeric: 3
Categorical: 2

Dataset

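Description: 회계연도 (fiscal year), 분야코드 (field code), 분야명 (field name), 정렬순서 (sort order), 인증코드 (authorization code)
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15711/S/1/datasetView.do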

Alerts

High correlation: 인증코드 is highly correlated overall with 분야코드 and 2 other fields
High correlation: 분야명 is highly correlated overall with 분야코드 and 2 other fields
High correlation: 분야코드 is highly correlated overall with 정렬순서 and 2 other fields
High correlation: 정렬순서 is highly correlated overall with 분야코드 and 2 other fields
Zeros: 분야코드 has 2 (2.4%) zeros
Zeros: 정렬순서 has 2 (2.4%) zeros

Reproduction

Analysis started: 2024-05-11 08:35:59.822279
Analysis finished: 2024-05-11 08:36:00.874269
Duration: 1.05 seconds
Software version: ydata-profiling v4.5.1
Configuration file: config.json

Variables

회계연도
Real number (ℝ)

Distinct: 8
Distinct (%): 9.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2017.0824
Minimum: 2013
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 897.0 B

Quantile statistics

Minimum: 2013
5th percentile: 2013
Q1: 2015
Median: 2017
Q3: 2019
95th percentile: 2021
Maximum: 2021
Range: 8
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.7351842
Coefficient of variation (CV): 0.0013560102
Kurtosis: -1.2310973
Mean: 2017.0824
Median Absolute Deviation (MAD): 2
Skewness: 0.061781638
Sum: 171452
Variance: 7.4812325
Monotonicity: Decreasing
Histogram with fixed size bins (bins=8)

Most frequent values
Value  Count  Frequency (%)
2021   18     21.2%
2019   10     11.8%
2018   10     11.8%
2017   10     11.8%
2014   10     11.8%
2013   10     11.8%
2016   9      10.6%
2015   8      9.4%

Smallest values
Value  Count  Frequency (%)
2013   10     11.8%
2014   10     11.8%
2015   8      9.4%
2016   9      10.6%
2017   10     11.8%
2018   10     11.8%
2019   10     11.8%
2021   18     21.2%

Largest values
Value  Count  Frequency (%)
2021   18     21.2%
2019   10     11.8%
2018   10     11.8%
2017   10     11.8%
2016   9      10.6%
2015   8      9.4%
2014   10     11.8%
2013   10     11.8%
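
As a cross-check, the summary statistics for 회계연도 can be recomputed from its value counts alone. The sketch below is plain Python and is not part of the ydata-profiling output. The variable names are illustrative, and it assumes the reported variance is the sample variance (divisor n - 1), which is the choice that reproduces the figure in the report.

```python
import math

# Value counts for 회계연도, copied from the frequency table above.
year_counts = {2013: 10, 2014: 10, 2015: 8, 2016: 9,
               2017: 10, 2018: 10, 2019: 10, 2021: 18}

n = sum(year_counts.values())                              # number of observations
total = sum(year * c for year, c in year_counts.items())   # sum of all values
mean = total / n

# Assumption: sample variance (divide by n - 1), not population variance.
ss = sum(c * (year - mean) ** 2 for year, c in year_counts.items())
variance = ss / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                            # coefficient of variation

print(n, total)
print(round(mean, 4), round(variance, 4), round(std, 4))
```

Under that assumption the script yields 85 observations, a sum of 171452, a mean of about 2017.0824 and a variance of about 7.4812, in line with the reported values.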

분야코드
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 19
Distinct (%): 22.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 596.47059
Minimum: 0
Maximum: 1800
Zeros: 2
Zeros (%): 2.4%
Negative: 0
Negative (%): 0.0%
Memory size: 897.0 B

Quantile statistics

Minimum: 0
5th percentile: 100
Q1: 300
Median: 600
Q3: 800
95th percentile: 1380
Maximum: 1800
Range: 1800
Interquartile range (IQR): 500

Descriptive statistics

Standard deviation: 393.23053
Coefficient of variation (CV): 0.65926223
Kurtosis: 0.79992038
Mean: 596.47059
Median Absolute Deviation (MAD): 300
Skewness: 0.88352114
Sum: 50700
Variance: 154630.25
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)

Most frequent values
Value             Count  Frequency (%)
100               8      9.4%
300               8      9.4%
400               8      9.4%
500               8      9.4%
600               8      9.4%
700               8      9.4%
800               8      9.4%
200               8      9.4%
900               7      8.2%
1000              4      4.7%
Other values (9)  10     11.8%

Smallest values
Value  Count  Frequency (%)
0      2      2.4%
100    8      9.4%
200    8      9.4%
300    8      9.4%
400    8      9.4%
500    8      9.4%
600    8      9.4%
700    8      9.4%
800    8      9.4%
900    7      8.2%

Largest values
Value  Count  Frequency (%)
1800   1      1.2%
1700   1      1.2%
1600   1      1.2%
1500   1      1.2%
1400   1      1.2%
1300   1      1.2%
1200   1      1.2%
1100   1      1.2%
1000   4      4.7%
900    7      8.2%

분야명
Categorical

Alerts: High correlation

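Distinct: 37
Distinct (%): 43.5%
Missing: 0
Missing (%): 0.0%
Memory size: 812.0 B

Category                             Count
공원 (parks)                           7
환경 (environment)                     7
교통주택 (transport and housing)        4
복지 (welfare)                         4
도시안전 (urban safety)                 4
Other values (32)                    59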

Length

Max length: 6
Median length: 5
Mean length: 3.1058824
Min length: 2

Unique

Unique: 17
Unique (%): 20.0%

Sample

1st row: 민주서울 (democratic Seoul)
2nd row: 여성 (women)
3rd row: 복지 (welfare)
4th row: 환경 (environment)
5th row: 시민건강 (public health)

Common Values

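Value                                Count  Frequency (%)
공원 (parks)                           7      8.2%
환경 (environment)                     7      8.2%
교통주택 (transport and housing)        4      4.7%
복지 (welfare)                         4      4.7%
도시안전 (urban safety)                 4      4.7%
교통 (transport)                       4      4.7%
주택 (housing)                         4      4.7%
여성보육 (women and childcare)          3      3.5%
문화관광 (culture and tourism)          3      3.5%
경제?일자리 (economy, jobs)             3      3.5%
Other values (27)                    42     49.4%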

Length

Histogram of lengths of the category

Value                                Count  Frequency (%)
공원 (parks)                           7      8.2%
환경 (environment)                     7      8.2%
교통주택 (transport and housing)        4      4.7%
복지 (welfare)                         4      4.7%
도시안전 (urban safety)                 4      4.7%
교통 (transport)                       4      4.7%
주택 (housing)                         4      4.7%
여성보육 (women and childcare)          3      3.5%
경제?일자리 (economy, jobs)             3      3.5%
문화체육 (culture and sports)           3      3.5%
Other values (27)                    42     49.4%

정렬순서
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 20
Distinct (%): 23.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5882353
Minimum: 0
Maximum: 20
Zeros: 2
Zeros (%): 2.4%
Negative: 0
Negative (%): 0.0%
Memory size: 897.0 B

Quantile statistics

Minimum: 0
5th percentile: 1
Q1: 3
Median: 6
Q3: 9
95th percentile: 15.8
Maximum: 20
Range: 20
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.4835523
Coefficient of variation (CV): 0.68053918
Kurtosis: 0.44585145
Mean: 6.5882353
Median Absolute Deviation (MAD): 3
Skewness: 0.86420479
Sum: 560
Variance: 20.102241
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)

Most frequent values
Value              Count  Frequency (%)
4                  8      9.4%
5                  8      9.4%
8                  8      9.4%
2                  8      9.4%
1                  7      8.2%
3                  7      8.2%
6                  7      8.2%
10                 7      8.2%
9                  6      7.1%
7                  5      5.9%
Other values (10)  14     16.5%

Smallest values
Value  Count  Frequency (%)
0      2      2.4%
1      7      8.2%
2      8      9.4%
3      7      8.2%
4      8      9.4%
5      8      9.4%
6      7      8.2%
7      5      5.9%
8      8      9.4%
9      6      7.1%

Largest values
Value  Count  Frequency (%)
20     1      1.2%
18     1      1.2%
17     2      2.4%
16     1      1.2%
15     1      1.2%
14     2      2.4%
13     1      1.2%
12     2      2.4%
11     1      1.2%
10     7      8.2%
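
The non-integer 95th percentile (15.8) comes from interpolating between two neighbouring observations. The sketch below rebuilds the 85 values of 정렬순서 from the ascending and descending value tables above, which together cover every row, and computes percentiles by linear interpolation. That interpolation rule is an assumption about how the report was computed (it is the default in NumPy and pandas), and the helper name `percentile` is illustrative.

```python
# Value counts for 정렬순서, combined from the ascending and descending
# value tables above (together they account for all 85 rows).
sort_order_counts = {
    0: 2, 1: 7, 2: 8, 3: 7, 4: 8, 5: 8, 6: 7, 7: 5, 8: 8, 9: 6,
    10: 7, 11: 1, 12: 2, 13: 1, 14: 2, 15: 1, 16: 1, 17: 2, 18: 1, 20: 1,
}

# Expand the counts into a sorted list of individual observations.
values = sorted(v for v, c in sort_order_counts.items() for _ in range(c))


def percentile(sorted_values, q):
    """Return the q-th percentile (0 <= q <= 100) by linear interpolation."""
    pos = (len(sorted_values) - 1) * q / 100    # fractional 0-based rank
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


q1, median, q3 = (percentile(values, q) for q in (25, 50, 75))
p95 = percentile(values, 95)
print(q1, median, q3, round(p95, 1))
```

With 85 observations the 95th percentile falls at rank 79.8, between the values 15 and 16, which is where 15.8 comes from. The quartiles land on whole ranks and so need no interpolation.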

인증코드
Categorical

Alerts: High correlation

Distinct: 11
Distinct (%): 12.9%
Missing: 0
Missing (%): 0.0%
Memory size: 812.0 B

Category           Count
<NA>               57
ROLE_BUILD         3
ROLE_ECONOMY       3
ROLE_PARK          3
ROLE_RESIDENCE     3
Other values (6)   16

Length

Max length: 16
Median length: 4
Mean length: 6.8235294
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value             Count  Frequency (%)
<NA>              57     67.1%
ROLE_BUILD        3      3.5%
ROLE_ECONOMY      3      3.5%
ROLE_PARK         3      3.5%
ROLE_RESIDENCE    3      3.5%
ROLE_CULTURE      3      3.5%
ROLE_HEALTH       3      3.5%
ROLE_CHILDCARE    3      3.5%
ROLE_ENVIRONMENT  3      3.5%
ROLE_COMMITTEE    2      2.4%
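
The report counts 0 missing cells, yet the most common value of 인증코드 is the four-character string `<NA>`, which is also why the minimum length is 4. This suggests, although the report alone cannot confirm it, that a missing-value placeholder was read as ordinary text. A minimal sketch of recounting missing values under that assumption follows. The function name and the placeholder set are illustrative, and the second dictionary key is a stand-in that lumps the remaining 28 rows together.

```python
def count_missing(value_counts, placeholders=("<NA>",)):
    """Return (missing, total), treating placeholder strings as missing.

    value_counts maps each category to its count; placeholders lists the
    literal strings assumed to stand for a missing value.
    """
    total = sum(value_counts.values())
    missing = sum(c for v, c in value_counts.items() if v in placeholders)
    return missing, total


# Counts from the tables above: "<NA>" versus all other categories combined.
auth_code_counts = {"<NA>": 57, "(all ROLE_* values)": 28}

missing, total = count_missing(auth_code_counts)
share = 100 * missing / total
print(f"{missing} of {total} rows ({share:.1f}%) would count as missing")
```

If the placeholder does mean "no value", roughly two thirds of the column is effectively empty, which would be worth handling when loading the data rather than after profiling.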

Length

Histogram of lengths of the category

Value             Count  Frequency (%)
na                57     67.1%
role_build        3      3.5%
role_economy      3      3.5%
role_park         3      3.5%
role_residence    3      3.5%
role_culture      3      3.5%
role_health       3      3.5%
role_childcare    3      3.5%
role_environment  3      3.5%
role_committee    2      2.4%

Interactions

[Figures: 9 interaction plots, images not included]

Correlations

           회계연도  분야코드  분야명   정렬순서  인증코드
회계연도    1.000    0.000    0.000   0.000    0.000
분야코드    0.000    1.000    0.973   0.873    1.000
분야명      0.000    0.973    1.000   0.934    1.000
정렬순서    0.000    0.873    0.934   1.000    0.967
인증코드    0.000    1.000    1.000   0.967    1.000

           인증코드  분야명
인증코드    1.000    1.000
분야명      1.000    1.000

           회계연도  분야코드  정렬순서  분야명   인증코드
회계연도    1.000    0.333    0.298    0.000   0.000
분야코드    0.333    1.000    0.561    0.659   0.885
정렬순서    0.298    0.561    1.000    0.559   0.688
분야명      0.000    0.659    0.559    1.000   1.000
인증코드    0.000    0.885    0.688    1.000   1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.

Sample

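First rows
Index  회계연도  분야코드  분야명                              정렬순서  인증코드
0      2021     100      민주서울 (democratic Seoul)        1        <NA>
1      2021     200      여성 (women)                     2        <NA>
2      2021     300      복지 (welfare)                   3        <NA>
3      2021     400      환경 (environment)               4        <NA>
4      2021     500      시민건강 (public health)          5        <NA>
5      2021     600      노동민생 (labor and livelihoods)  6        <NA>
6      2021     700      안전 (safety)                    7        <NA>
7      2021     800      교통 (transport)                 8        <NA>
8      2021     900      문화 (culture)                   9        <NA>
9      2021     1000     관광체육 (tourism and sports)     10       <NA>

Last rows
Index  회계연도  분야코드  분야명                              정렬순서  인증코드
75     2013     0        운영위 (steering committee)      0        ROLE_COMMITTEE
76     2013     800      경제산업 (economy and industry)   1        ROLE_ECONOMY
77     2013     600      환경 (environment)               2        ROLE_ENVIRONMENT
78     2013     700      공원 (parks)                     3        ROLE_PARK
79     2013     300      문화체육 (culture and sports)     4        ROLE_CULTURE
80     2013     100      여성보육 (women and childcare)    5        ROLE_CHILDCARE
81     2013     400      보건복지 (health and welfare)     6        ROLE_HEALTH
82     2013     500      건설 (construction)              7        ROLE_BUILD
83     2013     200      교통주택 (transport and housing)  8        ROLE_RESIDENCE
84     2013     900      백서위원 (white paper committee)  9        ROLE_WHITEPAPER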