Overview

Dataset statistics

Number of variables: 6
Number of observations: 894
Missing cells: 184
Missing cells (%): 3.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 43.8 KiB
Average record size in memory: 50.1 B
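
The derived figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report:

```python
# Raw counts as reported in the dataset statistics.
n_variables = 6
n_observations = 894
missing_cells = 184
total_size_kib = 43.8  # already rounded in the report, so results are approximate

# Share of missing cells over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per row, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
print(f"about {avg_record_bytes:.1f} B per record")
```

The record size computed this way lands slightly above the reported 50.1 B, most likely because the total size shown in the report is itself rounded.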

Variable types

Numeric: 2
Boolean: 3
Text: 1

Dataset

Description: Gyeonggi-do Gyeonggi Statistics System institution information (경기도 경기통계시스템 기관정보)
Author: Gyeonggi-do (경기도)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=E000SKXSHDMI8GTE9TRG33424815&infSeq=1

Alerts

통계작성여부 has constant value "" (Constant)
표준통계작성기관여부 has constant value "" (Constant)
조직번호 is highly overall correlated with 상위조직번호 and 1 other field (High correlation)
상위조직번호 is highly overall correlated with 조직번호 and 1 other field (High correlation)
부서여부 is highly overall correlated with 조직번호 and 1 other field (High correlation)
상위조직번호 has 184 (20.6%) missing values (Missing)
조직번호 has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 22:14:09.077247
Analysis finished: 2023-12-10 22:14:09.746964
Duration: 0.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
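
The reported duration is the difference between the two timestamps. A short check using only the standard library; the format string is an assumption about how the timestamps are written here:

```python
from datetime import datetime

# Timestamp layout assumed from the two values in the Reproduction section.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-10 22:14:09.077247", FMT)
finished = datetime.strptime("2023-12-10 22:14:09.746964", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```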

Variables

조직번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 894
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 212491.27
Minimum: 0
Maximum: 994000
Zeros: 1
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 8.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 190.15
Q1: 107010.5
Median: 154002.5
Q3: 331012.75
95-th percentile: 395000.35
Maximum: 994000
Range: 994000
Interquartile range (IQR): 224002.25

Descriptive statistics

Standard deviation: 195840.25
Coefficient of variation (CV): 0.92163901
Kurtosis: 4.8638492
Mean: 212491.27
Median Absolute Deviation (MAD): 153061
Skewness: 1.8066324
Sum: 1.899672 × 10^8
Variance: 3.8353403 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
201002              1      0.1%
390000              1      0.1%
385001              1      0.1%
386000              1      0.1%
386001              1      0.1%
387000              1      0.1%
387001              1      0.1%
387002              1      0.1%
388000              1      0.1%
388001              1      0.1%
Other values (884)  884    98.9%

Value  Count  Frequency (%)
0      1      0.1%
101    1      0.1%
102    1      0.1%
103    1      0.1%
105    1      0.1%
106    1      0.1%
109    1      0.1%
110    1      0.1%
111    1      0.1%
112    1      0.1%

Value   Count  Frequency (%)
994000  1      0.1%
993000  1      0.1%
989000  1      0.1%
987000  1      0.1%
986000  1      0.1%
985000  1      0.1%
979000  1      0.1%
971001  1      0.1%
971000  1      0.1%
969001  1      0.1%

상위조직번호
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 169
Distinct (%): 23.8%
Missing: 184
Missing (%): 20.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 267.46197
Minimum: 101
Maximum: 994
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.0 KiB

Quantile statistics

Minimum: 101
5-th percentile: 106
Q1: 118.5
Median: 231
Q3: 345
95-th percentile: 621
Maximum: 994
Range: 893
Interquartile range (IQR): 226.5

Descriptive statistics

Standard deviation: 183.30984
Coefficient of variation (CV): 0.68536786
Kurtosis: 6.1934205
Mean: 267.46197
Median Absolute Deviation (MAD): 113
Skewness: 2.2504088
Sum: 189898
Variance: 33602.497
Monotonicity: Not monotonic
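
Because 상위조직번호 has missing values, the denominators differ between statistics. The sketch below recomputes three reported figures under the assumption that Missing (%) is taken over all rows while Distinct (%) and the sum only involve non-missing rows; that reading is inferred from the numbers, not stated by the report:

```python
# Values as reported for 상위조직번호.
n_rows = 894
n_missing = 184
n_distinct = 169
mean = 267.46197

# Rows that actually carry a value.
n_present = n_rows - n_missing

missing_pct = 100 * n_missing / n_rows       # over all rows
distinct_pct = 100 * n_distinct / n_present  # over non-missing rows (assumed)
total = mean * n_present                     # sum implied by the mean

print(f"{n_present} non-missing rows")
print(f"missing {missing_pct:.1f}%, distinct {distinct_pct:.1f}%, sum {total:.0f}")
```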
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
117                 32     3.6%
116                 24     2.7%
361                 22     2.5%
115                 19     2.1%
118                 19     2.1%
301                 18     2.0%
101                 18     2.0%
123                 18     2.0%
114                 17     1.9%
345                 17     1.9%
Other values (159)  506    56.6%
(Missing)           184    20.6%

Value  Count  Frequency (%)
101    18     2.0%
102    4      0.4%
105    3      0.3%
106    15     1.7%
110    11     1.2%
111    5      0.6%
112    3      0.3%
113    8      0.9%
114    17     1.9%
115    19     2.1%

Value  Count  Frequency (%)
994    1      0.1%
993    1      0.1%
989    1      0.1%
987    1      0.1%
986    1      0.1%
985    1      0.1%
979    1      0.1%
971    2      0.2%
969    2      0.2%
967    1      0.1%

부서여부
Boolean

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB
Value  Count  Frequency (%)
True   710    79.4%
False  184    20.6%

조직명
Text

Distinct: 704
Distinct (%): 78.7%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Length

Max length: 25
Median length: 22
Mean length: 8.5693512
Min length: 2

Characters and Unicode

Total characters: 7661
Distinct characters: 291
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
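
As an illustration of such character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It uses the five sample values listed for this variable, so its counts cover only those rows and will not match the full-column totals. The two-letter codes `Lo` and `Zs` correspond to the "Other Letter" and "Space Separator" categories named in this report. This is not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter

# The five sample values shown for this text variable.
samples = [
    "교통국 교통계획과",
    "교통국 전산정보센터",
    "보건복지국 사회복지과",
    "보건사회국 사회과",
    "복지건강국 보건정책과",
]

# Normalize to NFC so each Hangul syllable is a single code point,
# then tally the general category of every character.
category_counts = Counter(
    unicodedata.category(ch)
    for text in samples
    for ch in unicodedata.normalize("NFC", text)
)

for category, count in category_counts.most_common():
    print(category, count)
```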

Unique

Unique: 694
Unique (%): 77.6%

Sample

1st row: 교통국 교통계획과
2nd row: 교통국 전산정보센터
3rd row: 보건복지국 사회복지과
4th row: 보건사회국 사회과
5th row: 복지건강국 보건정책과

Value               Count  Frequency (%)
기타                169    12.0%
기획관리실          19     1.3%
경제통계국          16     1.1%
조사부              16     1.1%
기획관실            10     0.7%
교육정보화과        8      0.6%
통계팀              7      0.5%
경영조사팀          7      0.5%
조사통계팀          7      0.5%
정보화담당관실      7      0.5%
Other values (918)  1147   81.2%

Most occurring characters

Value               Count  Frequency (%)
(space)             524    6.8%
                    336    4.4%
                    284    3.7%
                    267    3.5%
                    241    3.1%
                    228    3.0%
                    190    2.5%
                    182    2.4%
                    171    2.2%
                    170    2.2%
Other values (281)  5068   66.2%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       7088   92.5%
Space Separator    528    6.9%
Decimal Number     19     0.2%
Uppercase Letter   14     0.2%
Other Punctuation  4      0.1%
Close Punctuation  3      < 0.1%
Open Punctuation   3      < 0.1%
Lowercase Letter   1      < 0.1%
Dash Punctuation   1      < 0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
                    336    4.7%
                    284    4.0%
                    267    3.8%
                    241    3.4%
                    228    3.2%
                    190    2.7%
                    182    2.6%
                    171    2.4%
                    170    2.4%
                    161    2.3%
Other values (260)  4858   68.5%

Uppercase Letter
Value  Count  Frequency (%)
T      3      21.4%
D      2      14.3%
R      2      14.3%
H      2      14.3%
K      2      14.3%
G      1      7.1%
B      1      7.1%
I      1      7.1%

Decimal Number
Value  Count  Frequency (%)
1      7      36.8%
2      6      31.6%
0      5      26.3%
       1      5.3%

Other Punctuation
Value  Count  Frequency (%)
/      2      50.0%
&      1      25.0%
·      1      25.0%

Space Separator
Value              Count  Frequency (%)
(space)            524    99.2%
(non-ASCII space)  4      0.8%

Close Punctuation
Value  Count  Frequency (%)
)      3      100.0%

Open Punctuation
Value  Count  Frequency (%)
(      3      100.0%

Lowercase Letter
Value  Count  Frequency (%)
e      1      100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  7088   92.5%
Common  558    7.3%
Latin   15     0.2%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
                    336    4.7%
                    284    4.0%
                    267    3.8%
                    241    3.4%
                    228    3.2%
                    190    2.7%
                    182    2.6%
                    171    2.4%
                    170    2.4%
                    161    2.3%
Other values (260)  4858   68.5%

Common
Value              Count  Frequency (%)
(space)            524    93.9%
1                  7      1.3%
2                  6      1.1%
0                  5      0.9%
(non-ASCII space)  4      0.7%
)                  3      0.5%
(                  3      0.5%
/                  2      0.4%
                   1      0.2%
&                  1      0.2%
Other values (2)   2      0.4%

Latin
Value  Count  Frequency (%)
T      3      20.0%
D      2      13.3%
R      2      13.3%
H      2      13.3%
K      2      13.3%
G      1      6.7%
B      1      6.7%
I      1      6.7%
e      1      6.7%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  7088   92.5%
ASCII   567    7.4%
None    6      0.1%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
(space)           524    92.4%
1                 7      1.2%
2                 6      1.1%
0                 5      0.9%
T                 3      0.5%
)                 3      0.5%
(                 3      0.5%
/                 2      0.4%
D                 2      0.4%
R                 2      0.4%
Other values (8)  10     1.8%

Hangul
Value               Count  Frequency (%)
                    336    4.7%
                    284    4.0%
                    267    3.8%
                    241    3.4%
                    228    3.2%
                    190    2.7%
                    182    2.6%
                    171    2.4%
                    170    2.4%
                    161    2.3%
Other values (260)  4858   68.5%

None
Value              Count  Frequency (%)
(non-ASCII space)  4      66.7%
                   1      16.7%
·                  1      16.7%

통계작성여부
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB
Value  Count  Frequency (%)
False  894    100.0%

표준통계작성기관여부
Boolean

CONSTANT

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB
Value  Count  Frequency (%)
False  894    100.0%

Interactions


Correlations

              조직번호  상위조직번호  부서여부
조직번호      1.000     0.997         1.000
상위조직번호  0.997     1.000         NaN
부서여부      1.000     NaN           1.000

              조직번호  상위조직번호  부서여부
조직번호      1.000     1.000         0.997
상위조직번호  1.000     1.000         1.000
부서여부      0.997     1.000         1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

     조직번호   상위조직번호   부서여부   조직명   통계작성여부   표준통계작성기관여부
0    201002   201   Y   교통국 교통계획과   N   N
1    201003   201   Y   교통국 전산정보센터   N   N
2    201004   201   Y   보건복지국 사회복지과   N   N
3    201005   201   Y   보건사회국 사회과   N   N
4    201006   201   Y   복지건강국 보건정책과   N   N
5    201007   201   Y   복지건강국 장애인복지과   N   N
6    201008   201   Y   복지여성국 보건과   N   N
7    201009   201   Y   송파구 기획예산과   N   N
8    201010   201   Y   전산정보담당관실   N   N
9    201011   201   Y   정보화기획단장 정보화기획담당관실   N   N

     조직번호   상위조직번호   부서여부   조직명   통계작성여부   표준통계작성기관여부
884  132002   132   Y   교통관리관실 교통안전과   N   N
885  133000   133   Y   기타   N   N
886  133001   133   Y   부동산납세관리국 종합부동산세과   N   N
887  134000   134   Y   기타   N   N
888  134001   134   Y   통관지원국 통관기획과   N   N
889  135000   135   Y   기타   N   N
890  135001   135   Y   총무부  기획과   N   N
891  136000   136   Y   기타   N   N
892  136001   136   Y   사유림자원국 산림소득과   N   N
893  136002   136   Y   사유림지원국 산림소득과   N   N
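
In the sample rows, each 조직번호 looks like its 상위조직번호 followed by a three-digit sequence number, which would account for the near-perfect correlation between the two columns. The sketch below checks that pattern on a handful of the rows shown above. The rule is an inference from these 20 rows, not a documented property of the dataset, and `parent_of` is a hypothetical helper name.

```python
# (조직번호, 상위조직번호) pairs copied from the sample rows.
rows = [
    (201002, 201),
    (201011, 201),
    (132002, 132),
    (133000, 133),
    (135001, 135),
    (136002, 136),
]


def parent_of(org_no: int) -> int:
    """Drop the last three digits (assumed to be a sequence number)."""
    return org_no // 1000


# Collect any row where the derived parent differs from the recorded one.
mismatches = [(org, parent) for org, parent in rows if parent_of(org) != parent]
print("mismatches:", mismatches)
```

If the pattern held for the whole column, 상위조직번호 would carry little information beyond what 조직번호 already encodes, so confirming it against the full data would be worthwhile before dropping either field.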