Overview

Dataset statistics

Number of variables: 13
Number of observations: 667
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 67.9 KiB
Average record size in memory: 104.2 B
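
As a quick consistency check, the average record size follows from dividing the total in-memory size by the number of observations. The sketch below assumes that KiB means 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures taken from the dataset statistics above.
total_size_kib = 67.9
n_observations = 667

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")  # prints "104.2 B"
```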

Variable types

Categorical: 13

Dataset

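Description: 시도명 (province name), 시군구명 (city/county name), 학교급 (school level), 학교명 (school name), 교사 내부 관제 (대) (CCTV units inside school buildings), 교사 외부 관제 (대) (CCTV units outside school buildings), 총 합계 (대) (total units), 화소 50만 이하 (대) (units at 500,000 pixels or fewer), 화소 51~100만 미만 (대) (units at 510,000 to under 1,000,000 pixels), 화소 100~130만 미만 (대) (units at 1,000,000 to under 1,300,000 pixels), 화소 130~200만 미만 (대) (units at 1,300,000 to under 2,000,000 pixels), 화소 200만 이상 (대) (units at 2,000,000 pixels or more), 데이터기준일자 (data reference date)
URL: https://data.gwd.go.kr//dataset/view?infId=OA-12982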

Alerts

학교명 has a high cardinality: 653 distinct values (High cardinality)
교사 외부 관제 (대) has a high cardinality: 71 distinct values (High cardinality)
총 합계 (대) has a high cardinality: 74 distinct values (High cardinality)
화소 200만 이상 (대) has a high cardinality: 74 distinct values (High cardinality)
화소 51~100만 미만 (대) is highly correlated with 화소 200만 이상 (대) and 5 other fields (High correlation)
화소 200만 이상 (대) is highly correlated with 화소 51~100만 미만 (대) and 6 other fields (High correlation)
데이터기준일자 is highly correlated with 화소 200만 이상 (대) and 5 other fields (High correlation)
화소 50만 이하 (대) is highly correlated with 화소 200만 이상 (대) and 5 other fields (High correlation)
학교급 is highly correlated with 화소 51~100만 미만 (대) and 5 other fields (High correlation)
교사 내부 관제 (대) is highly correlated with 화소 51~100만 미만 (대) and 5 other fields (High correlation)
시도명 is highly correlated with 화소 200만 이상 (대) and 5 other fields (High correlation)
화소 100~130만 미만 (대) is highly correlated with 화소 200만 이상 (대) and 5 other fields (High correlation)
총 합계 (대) is highly correlated with 화소 51~100만 미만 (대) and 6 other fields (High correlation)
화소 130~200만 미만 (대) is highly correlated with 화소 200만 이상 (대) and 5 other fields (High correlation)
교사 외부 관제 (대) is highly correlated with 화소 51~100만 미만 (대) and 5 other fields (High correlation)
시군구명 is highly correlated with 화소 51~100만 미만 (대) and 5 other fields (High correlation)

Reproduction

Analysis started: 2022-08-11 14:11:37.574564
Analysis finished: 2022-08-11 14:11:41.768762
Duration: 4.19 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

시도명 (province name)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 KiB
Top values: 강원도 (666), SIDO_NM (1)

Length

Max length: 7
Median length: 3
Mean length: 3.005997001
Min length: 3
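
The mean length above can be reproduced from the value counts reported for this variable: 666 values of length 3 (강원도) and one value of length 7 (SIDO_NM). A minimal sketch, with illustrative variable names:

```python
# Value counts reported for this variable.
counts = {"강원도": 666, "SIDO_NM": 1}

# Weighted mean of the string lengths.
total_chars = sum(len(value) * n for value, n in counts.items())
total_rows = sum(counts.values())
mean_length = total_chars / total_rows

print(round(mean_length, 9))  # prints 3.005997001
```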

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: SIDO_NM
2nd row: 강원도
3rd row: 강원도
4th row: 강원도
5th row: 강원도

Common Values

Value | Count | Frequency (%)
강원도 | 666 | 99.9%
SIDO_NM | 1 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Category Frequency Plot

[Figure: Category frequency plot]

시군구명 (city/county name)
Categorical

HIGH CORRELATION

Distinct: 20
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 KiB
Top values: 원주시 (90), 춘천시 (78), 강릉시 (62), 홍천군 (49), 삼척시 (40), Other values (15) (348)

Length

Max length: 6
Median length: 3
Mean length: 3.005997001
Min length: 3

Unique

Unique: 2
Unique (%): 0.3%

Sample

1st row: GOV_NM
2nd row: 화천군
3rd row: 화천군
4th row: 화천군
5th row: 화천군

Common Values

Value | Count | Frequency (%)
원주시 | 90 | 13.5%
춘천시 | 78 | 11.7%
강릉시 | 62 | 9.3%
홍천군 | 49 | 7.3%
삼척시 | 40 | 6.0%
정선군 | 35 | 5.2%
영월군 | 34 | 5.1%
횡성군 | 34 | 5.1%
평창군 | 32 | 4.8%
인제군 | 27 | 4.0%
Other values (10) | 186 | 27.9%

Length

[Figure: Histogram of lengths of the category]

Words

Value | Count | Frequency (%)
원주시 | 90 | 13.5%
춘천시 | 78 | 11.7%
강릉시 | 62 | 9.3%
홍천군 | 49 | 7.3%
삼척시 | 41 | 6.1%
정선군 | 35 | 5.2%
영월군 | 34 | 5.1%
횡성군 | 34 | 5.1%
평창군 | 32 | 4.8%
동해시 | 27 | 4.0%
Other values (9) | 185 | 27.7%

학교급 (school level)
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 KiB
Top values (Value | Count):
 | 379
 | 163
 | 116
 | 8
SCHOOL_GRADE | 1

Length

Max length: 12
Median length: 1
Mean length: 1.016491754
Min length: 1

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: SCHOOL_GRADE
2nd row:
3rd row:
4th row:
5th row:

Common Values

Value | Count | Frequency (%)
 | 379 | 56.8%
 | 163 | 24.4%
 | 116 | 17.4%
 | 8 | 1.2%
SCHOOL_GRADE | 1 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Category Frequency Plot

[Figure: Category frequency plot]

학교명 (school name)
Categorical

HIGH CARDINALITY

Distinct: 653
Distinct (%): 97.9%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 KiB
Top values: 중앙초등학교 (4), 교동초등학교 (4), 남산초등학교 (3), 속초초등학교 (2), 조양초등학교 (2), Other values (648) (652)

Length

Max length: 10
Median length: 6
Mean length: 6.125937031
Min length: 4

Unique

Unique: 644
Unique (%): 96.6%

Sample

1st row: SCHOOL_NM
2nd row: 화천중학교
3rd row: 상서중학교
4th row: 사내중학교
5th row: 간동중학교

Common Values

Value | Count | Frequency (%)
중앙초등학교 | 4 | 0.6%
교동초등학교 | 4 | 0.6%
남산초등학교 | 3 | 0.4%
속초초등학교 | 2 | 0.3%
조양초등학교 | 2 | 0.3%
반곡초등학교 | 2 | 0.3%
북평초등학교 | 2 | 0.3%
원당초등학교 | 2 | 0.3%
신동초등학교 | 2 | 0.3%
사내중학교 | 1 | 0.1%
Other values (643) | 643 | 96.4%

Length

[Figure: Histogram of lengths of the category]

Words

Value | Count | Frequency (%)
중앙초등학교 | 4 | 0.6%
교동초등학교 | 4 | 0.6%
남산초등학교 | 3 | 0.4%
기린초 | 2 | 0.3%
속초초등학교 | 2 | 0.3%
조양초등학교 | 2 | 0.3%
반곡초등학교 | 2 | 0.3%
북평초등학교 | 2 | 0.3%
원당초등학교 | 2 | 0.3%
신동초등학교 | 2 | 0.3%
Other values (663) | 664 | 96.4%

교사 내부 관제 (대) (CCTV units inside school buildings)
Categorical

HIGH CORRELATION

Distinct: 23
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 KiB
Top values: 0 (408), 1 (56), 2 (51), 3 (44), 4 (28), Other values (18) (80)

Length

Max length: 13
Median length: 1
Mean length: 1.044977511
Min length: 1

Unique

Unique: 8
Unique (%): 1.2%

Sample

1st row: INTERNAL_CCTV
2nd row: 4
3rd row: 0
4th row: 3
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 408 | 61.2%
1 | 56 | 8.4%
2 | 51 | 7.6%
3 | 44 | 6.6%
4 | 28 | 4.2%
8 | 17 | 2.5%
6 | 14 | 2.1%
5 | 14 | 2.1%
9 | 8 | 1.2%
7 | 8 | 1.2%
Other values (13) | 19 | 2.8%

Length

[Figure: Histogram of lengths of the category]

교사 외부 관제 (대) (CCTV units outside school buildings)
Categorical

HIGH CARDINALITY
HIGH CORRELATION

Distinct: 71
Distinct (%): 10.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 KiB
Top values: 9 (51), 10 (46), 11 (44), 16 (42), 8 (42), Other values (66) (442)

Length

Max length: 13
Median length: 2
Mean length: 1.769115442
Min length: 1

Unique

Unique: 19
Unique (%): 2.8%

Sample

1st row: EXTERNAL_CCTV
2nd row: 9
3rd row: 15
4th row: 18
5th row: 10

Common Values

Value | Count | Frequency (%)
9 | 51 | 7.6%
10 | 46 | 6.9%
11 | 44 | 6.6%
16 | 42 | 6.3%
8 | 42 | 6.3%
13 | 40 | 6.0%
12 | 38 | 5.7%
14 | 31 | 4.6%
7 | 20 | 3.0%
0 | 20 | 3.0%
Other values (61) | 293 | 43.9%

Length

[Figure: Histogram of lengths of the category]

총 합계 (대) (total units)
Categorical

HIGH CARDINALITY
HIGH CORRELATION

Distinct: 74
Distinct (%): 11.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 KiB
Top values: 16 (50), 12 (48), 9 (39), 13 (36), 14 (36), Other values (69) (458)

Length

Max length: 10
Median length: 2
Mean length: 1.821589205
Min length: 1

Unique

Unique: 17
Unique (%): 2.5%

Sample

1st row: CCTV_TOTAL
2nd row: 13
3rd row: 15
4th row: 21
5th row: 10

Common Values

Value | Count | Frequency (%)
16 | 50 | 7.5%
12 | 48 | 7.2%
9 | 39 | 5.8%
13 | 36 | 5.4%
14 | 36 | 5.4%
8 | 35 | 5.2%
10 | 32 | 4.8%
11 | 32 | 4.8%
15 | 27 | 4.0%
0 | 20 | 3.0%
Other values (64) | 312 | 46.8%

Length

[Figure: Histogram of lengths of the category]

화소 50만 이하 (대) (units at 500,000 pixels or fewer)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 KiB
Top values: 0 (666), LESS_THEN_P50 (1)

Length

Max length: 13
Median length: 1
Mean length: 1.017991004
Min length: 1

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: LESS_THEN_P50
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 666 | 99.9%
LESS_THEN_P50 | 1 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Category Frequency Plot

[Figure: Category frequency plot]

화소 51~100만 미만 (대) (units at 510,000 to under 1,000,000 pixels)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 KiB
Top values: 0 (666), P51_P100 (1)

Length

Max length: 8
Median length: 1
Mean length: 1.010494753
Min length: 1

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: P51_P100
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 666 | 99.9%
P51_P100 | 1 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Category Frequency Plot

[Figure: Category frequency plot]

화소 100~130만 미만 (대) (units at 1,000,000 to under 1,300,000 pixels)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 KiB
Top values: 0 (666), P100_P130 (1)

Length

Max length: 9
Median length: 1
Mean length: 1.011994003
Min length: 1

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: P100_P130
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 666 | 99.9%
P100_P130 | 1 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Category Frequency Plot

[Figure: Category frequency plot]

화소 130~200만 미만 (대) (units at 1,300,000 to under 2,000,000 pixels)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 KiB
Top values: 0 (666), P130_P200 (1)

Length

Max length: 9
Median length: 1
Mean length: 1.011994003
Min length: 1

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: P130_P200
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 666 | 99.9%
P130_P200 | 1 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Category Frequency Plot

[Figure: Category frequency plot]

화소 200만 이상 (대) (units at 2,000,000 pixels or more)
Categorical

HIGH CARDINALITY
HIGH CORRELATION

Distinct: 74
Distinct (%): 11.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 KiB
Top values: 16 (50), 12 (48), 9 (39), 13 (36), 14 (36), Other values (69) (458)

Length

Max length: 14
Median length: 2
Mean length: 1.827586207
Min length: 1

Unique

Unique: 17
Unique (%): 2.5%

Sample

1st row: MORE_THEN_P200
2nd row: 13
3rd row: 15
4th row: 21
5th row: 10

Common Values

Value | Count | Frequency (%)
16 | 50 | 7.5%
12 | 48 | 7.2%
9 | 39 | 5.8%
13 | 36 | 5.4%
14 | 36 | 5.4%
8 | 35 | 5.2%
10 | 32 | 4.8%
11 | 32 | 4.8%
15 | 27 | 4.0%
0 | 20 | 3.0%
Other values (64) | 312 | 46.8%

Length

[Figure: Histogram of lengths of the category]

데이터기준일자 (data reference date)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 KiB
Top values: 20200224 (666), STD_DT (1)

Length

Max length: 8
Median length: 8
Mean length: 7.997001499
Min length: 6

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: STD_DT
2nd row: 20200224
3rd row: 20200224
4th row: 20200224
5th row: 20200224

Common Values

Value | Count | Frequency (%)
20200224 | 666 | 99.9%
STD_DT | 1 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Category Frequency Plot

[Figure: Category frequency plot]

Correlations

[Figure: correlation heatmap]

Phik (φk)

Phik (φk) is a practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

[Figure: correlation heatmap]

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
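
The bias correction described above can be sketched in plain Python. This is an illustrative implementation of Bergsma's corrected estimator, not the code used by pandas-profiling; the function name `cramers_v_corrected` is hypothetical, and returning 0.0 for constant inputs is a choice made for this sketch.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V (Bergsma, 2013) for two equal-length label lists."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))
    r, k = len(row_totals), len(col_totals)
    if n < 2 or r < 2 or k < 2:
        return 0.0  # sketch-level choice: no association reported for constant input

    # Pearson chi-squared statistic over the full r x k contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (observed[(a, b)] - expected) ** 2 / expected

    # Bergsma's correction shrinks phi^2 and the effective table dimensions.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Two columns shaped like the pixel-count variables in this report:
# 666 rows of "0" plus one header-like string in the same row.
col_a = ["0"] * 666 + ["P51_P100"]
col_b = ["0"] * 666 + ["P100_P130"]
v = cramers_v_corrected(col_a, col_b)
```

For two such columns the sketch returns a value of approximately 1, which would be consistent with the high-correlation alerts raised between the nearly constant columns.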

Missing values

[Figure: nullity by column]
A simple visualization of nullity by column.
[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

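Index | 시도명 | 시군구명 | 학교급 | 학교명 | 교사 내부 관제 (대) | 교사 외부 관제 (대) | 총 합계 (대) | 화소 50만 이하 (대) | 화소 51~100만 미만 (대) | 화소 100~130만 미만 (대) | 화소 130~200만 미만 (대) | 화소 200만 이상 (대) | 데이터기준일자
0 | SIDO_NM | GOV_NM | SCHOOL_GRADE | SCHOOL_NM | INTERNAL_CCTV | EXTERNAL_CCTV | CCTV_TOTAL | LESS_THEN_P50 | P51_P100 | P100_P130 | P130_P200 | MORE_THEN_P200 | STD_DT
1 | 강원도 | 화천군 |  | 화천중학교 | 4 | 9 | 13 | 0 | 0 | 0 | 0 | 13 | 20200224
2 | 강원도 | 화천군 |  | 상서중학교 | 0 | 15 | 15 | 0 | 0 | 0 | 0 | 15 | 20200224
3 | 강원도 | 화천군 |  | 사내중학교 | 3 | 18 | 21 | 0 | 0 | 0 | 0 | 21 | 20200224
4 | 강원도 | 화천군 |  | 간동중학교 | 0 | 10 | 10 | 0 | 0 | 0 | 0 | 10 | 20200224
5 | 강원도 | 양구군 |  | 해안중학교 | 0 | 12 | 12 | 0 | 0 | 0 | 0 | 12 | 20200224
6 | 강원도 | 양구군 |  | 양구중학교 | 1 | 13 | 14 | 0 | 0 | 0 | 0 | 14 | 20200224
7 | 강원도 | 양구군 |  | 대암중학교 | 0 | 11 | 11 | 0 | 0 | 0 | 0 | 11 | 20200224
8 | 강원도 | 양구군 |  | 방산중학교 | 0 | 4 | 4 | 0 | 0 | 0 | 0 | 4 | 20200224
9 | 강원도 | 양구군 |  | 석천중학교 | 0 | 19 | 19 | 0 | 0 | 0 | 0 | 19 | 20200224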

Last rows

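Index | 시도명 | 시군구명 | 학교급 | 학교명 | 교사 내부 관제 (대) | 교사 외부 관제 (대) | 총 합계 (대) | 화소 50만 이하 (대) | 화소 51~100만 미만 (대) | 화소 100~130만 미만 (대) | 화소 130~200만 미만 (대) | 화소 200만 이상 (대) | 데이터기준일자
657 | 강원도 | 정선군 |  | 고한중학교 | 0 | 10 | 10 | 0 | 0 | 0 | 0 | 10 | 20200224
658 | 강원도 | 정선군 |  | 나전중학교 | 0 | 11 | 11 | 0 | 0 | 0 | 0 | 11 | 20200224
659 | 강원도 | 정선군 |  | 사북중학교 | 0 | 16 | 16 | 0 | 0 | 0 | 0 | 16 | 20200224
660 | 강원도 | 정선군 |  | 여량중학교 | 0 | 15 | 15 | 0 | 0 | 0 | 0 | 15 | 20200224
661 | 강원도 | 정선군 |  | 임계중학교 | 0 | 16 | 16 | 0 | 0 | 0 | 0 | 16 | 20200224
662 | 강원도 | 정선군 |  | 정선중학교 | 0 | 31 | 31 | 0 | 0 | 0 | 0 | 31 | 20200224
663 | 강원도 | 정선군 |  | 함백중학교 | 3 | 18 | 21 | 0 | 0 | 0 | 0 | 21 | 20200224
664 | 강원도 | 정선군 |  | 화동중학교 | 0 | 11 | 11 | 0 | 0 | 0 | 0 | 11 | 20200224
665 | 강원도 | 정선군 |  | 문곡중학교 | 2 | 18 | 20 | 0 | 0 | 0 | 0 | 20 | 20200224
666 | 강원도 | 철원군 |  | 김화여자중학교 | 6 | 7 | 13 | 0 | 0 | 0 | 0 | 13 | 20200224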
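
Row 0 of the sample holds column codes (SIDO_NM, GOV_NM, ...) rather than data, which suggests that a second header line was read as a record; that would also be consistent with every column being typed Categorical. The sketch below, written under that assumption, drops such rows and casts the count columns to integers. The helper name `drop_non_numeric_rows` is hypothetical, and the rows are a four-column excerpt of the values shown in the per-variable samples.

```python
NAME = "학교명"
INTERNAL = "교사 내부 관제 (대)"
EXTERNAL = "교사 외부 관제 (대)"
TOTAL = "총 합계 (대)"
COUNT_COLUMNS = [INTERNAL, EXTERNAL, TOTAL]

# Excerpt of the first rows; row 0 holds the column codes instead of data.
rows = [
    {NAME: "SCHOOL_NM", INTERNAL: "INTERNAL_CCTV", EXTERNAL: "EXTERNAL_CCTV", TOTAL: "CCTV_TOTAL"},
    {NAME: "화천중학교", INTERNAL: "4", EXTERNAL: "9", TOTAL: "13"},
    {NAME: "상서중학교", INTERNAL: "0", EXTERNAL: "15", TOTAL: "15"},
    {NAME: "사내중학교", INTERNAL: "3", EXTERNAL: "18", TOTAL: "21"},
]


def drop_non_numeric_rows(rows):
    """Hypothetical helper: keep rows whose count fields are all digits, cast to int."""
    cleaned = []
    for row in rows:
        if all(row[col].isdigit() for col in COUNT_COLUMNS):
            cleaned.append({**row, **{col: int(row[col]) for col in COUNT_COLUMNS}})
    return cleaned


data = drop_non_numeric_rows(rows)

# With integer columns, the total can be checked against its two parts.
consistent = all(r[INTERNAL] + r[EXTERNAL] == r[TOTAL] for r in data)
print(len(data), consistent)  # prints "3 True"
```

Once the counts are numeric, a profiling run would be expected to treat them as numeric variables instead of categories, though that depends on how the file is loaded.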