Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 742.2 KiB
Average record size in memory: 76.0 B

Variable types

Categorical: 4
Text: 1
Numeric: 2
DateTime: 1

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15972/S/1/datasetView.do

Alerts

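기관 명 (institution name) has constant value "노원구" [Constant]
모델명 (model name) has constant value "GDS-100T" [Constant]
불량여부(정상:0, 불량:1) (defect status; normal: 0, defective: 1) is highly imbalanced (91.6%) [Imbalance]
위도 (latitude) is highly skewed (γ1 = -64.33826847) [Skewed]
경도 (longitude) is highly skewed (γ1 = -57.71034948) [Skewed]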
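
The 91.6% imbalance figure is consistent with one minus the normalized Shannon entropy of the class frequencies. The sketch below assumes that definition (the report itself does not state it) and uses the counts reported for 불량여부(정상:0, 불량:1): 9895 normal rows and 105 defective rows. The helper name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H(p) / log2(k), where H is the Shannon entropy in bits.

    0.0 means perfectly balanced classes; values near 1.0 mean one class
    dominates. Assumed definition; hypothetical helper, not from the report.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(k)

# Counts reported for the defect flag: 9895 normal (0), 105 defective (1).
score = imbalance_score([9895, 105])
print(f"{score:.1%}")  # prints 91.6%
```

Under this definition a 50/50 split scores 0, so the score reflects how far the class distribution is from uniform rather than the share of the majority class.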

Reproduction

Analysis started: 2024-05-11 15:58:40.309627
Analysis finished: 2024-05-11 15:58:44.807261
Duration: 4.5 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

기관 명 (institution name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
노원구: 10000

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 노원구
2nd row: 노원구
3rd row: 노원구
4th row: 노원구
5th row: 노원구

Common Values

Value | Count | Frequency (%)
노원구 | 10000 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of value frequencies; 노원구 accounts for all 10000 rows (100.0%)]

모델명 (model name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
GDS-100T: 10000

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: GDS-100T
2nd row: GDS-100T
3rd row: GDS-100T
4th row: GDS-100T
5th row: GDS-100T

Common Values

Value | Count | Frequency (%)
GDS-100T | 10000 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of value frequencies; GDS-100T accounts for all 10000 rows (100.0%)]

시리얼 (serial number)
Text

Distinct: 4067
Distinct (%): 40.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 7
Mean length: 7.6458
Min length: 7

Characters and Unicode

Total characters: 76458
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
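
As a minimal sketch of how such per-character properties can be tallied, the example below applies Python's standard `unicodedata.category` to the five sample serials listed in this report. It yields two-letter general-category codes (`Nd` for Decimal Number, `Pd` for Dash Punctuation, `Ll` for Lowercase Letter). The helper name `category_counts` is hypothetical, and the profiler's own implementation may differ.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character in the strings."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)

# The five sample serials shown in this report.
samples = ["19-2792", "lr-0005025", "12-0228", "2019-0001119", "12-8516"]
counts = category_counts(samples)

# Print categories from most to least frequent.
for code, n in counts.most_common():
    print(code, n)
```

On these five values the same three categories appear as in the full column: digits dominate, each serial carries exactly one dash, and only the `lr-` prefixed serial contributes letters.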

Unique

Unique: 1080
Unique (%): 10.8%

Sample

1st row: 19-2792
2nd row: lr-0005025
3rd row: 12-0228
4th row: 2019-0001119
5th row: 12-8516

Value | Count | Frequency (%)
2019-0004213 | 9 | 0.1%
16-4210 | 8 | 0.1%
12-7346 | 8 | 0.1%
12-1090 | 8 | 0.1%
21-8920 | 7 | 0.1%
15-0836 | 7 | 0.1%
15-5281 | 7 | 0.1%
16-6936 | 7 | 0.1%
2019-0001142 | 7 | 0.1%
13-0980 | 7 | 0.1%
Other values (4057) | 9925 | 99.2%

Most occurring characters

Value | Count | Frequency (%)
1 | 12996 | 17.0%
2 | 10665 | 13.9%
- | 10000 | 13.1%
0 | 9575 | 12.5%
5 | 6535 | 8.5%
9 | 5035 | 6.6%
6 | 4726 | 6.2%
3 | 4448 | 5.8%
4 | 4314 | 5.6%
7 | 4175 | 5.5%
Other values (3) | 3989 | 5.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 66406 | 86.9%
Dash Punctuation | 10000 | 13.1%
Lowercase Letter | 52 | 0.1%

Most frequent character per category

Decimal Number

Value | Count | Frequency (%)
1 | 12996 | 19.6%
2 | 10665 | 16.1%
0 | 9575 | 14.4%
5 | 6535 | 9.8%
9 | 5035 | 7.6%
6 | 4726 | 7.1%
3 | 4448 | 6.7%
4 | 4314 | 6.5%
7 | 4175 | 6.3%
8 | 3937 | 5.9%

Lowercase Letter

Value | Count | Frequency (%)
l | 26 | 50.0%
r | 26 | 50.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 10000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 76406 | 99.9%
Latin | 52 | 0.1%

Most frequent character per script

Common

Value | Count | Frequency (%)
1 | 12996 | 17.0%
2 | 10665 | 14.0%
- | 10000 | 13.1%
0 | 9575 | 12.5%
5 | 6535 | 8.6%
9 | 5035 | 6.6%
6 | 4726 | 6.2%
3 | 4448 | 5.8%
4 | 4314 | 5.6%
7 | 4175 | 5.5%

Latin

Value | Count | Frequency (%)
l | 26 | 50.0%
r | 26 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 76458 | 100.0%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
1 | 12996 | 17.0%
2 | 10665 | 13.9%
- | 10000 | 13.1%
0 | 9575 | 12.5%
5 | 6535 | 8.5%
9 | 5035 | 6.6%
6 | 4726 | 6.2%
3 | 4448 | 5.8%
4 | 4314 | 5.6%
7 | 4175 | 5.5%
Other values (3) | 3989 | 5.2%

위도 (latitude)
Real number (ℝ)

SKEWED

Distinct: 3870
Distinct (%): 38.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.644485
Minimum: 31.629479
Maximum: 37.929737
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 31.629479
5-th percentile: 37.618969
Q1: 37.624459
median: 37.648126
Q3: 37.664067
95-th percentile: 37.675446
Maximum: 37.929737
Range: 6.300258
Interquartile range (IQR): 0.03960875
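
One common way to relate the quartiles above to the extreme values is Tukey's 1.5 × IQR rule. The sketch below applies it to the reported latitude quartiles. The rule and the helper name `tukey_fences` are illustrative assumptions, not something the report itself computes.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier fences: q1 - k*IQR and q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles and extremes as reported for 위도 (latitude).
q1, q3 = 37.624459, 37.664067
minimum, maximum = 31.629479, 37.929737

lower, upper = tukey_fences(q1, q3)
print(f"fences: [{lower:.6f}, {upper:.6f}]")
print("minimum is an outlier:", minimum < lower)
print("maximum is an outlier:", maximum > upper)
```

Under this rule both the reported minimum (31.629479) and the reported maximum (37.929737) fall outside the fences, which is in line with the strong skew flagged for this column.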

Descriptive statistics

Standard deviation: 0.087788925
Coefficient of variation (CV): 0.0023320528
Kurtosis: 4407.8868
Mean: 37.644485
Median Absolute Deviation (MAD): 0.02139
Skewness: -64.338268
Sum: 376444.85
Variance: 0.0077068954
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values

Value | Count | Frequency (%)
37.62444 | 10 | 0.1%
37.641679 | 10 | 0.1%
37.621931 | 9 | 0.1%
37.656207 | 9 | 0.1%
37.656054 | 9 | 0.1%
37.663114 | 9 | 0.1%
37.620512 | 9 | 0.1%
37.663622 | 8 | 0.1%
37.627862 | 8 | 0.1%
37.624042 | 8 | 0.1%
Other values (3860) | 9911 | 99.1%

Smallest values

Value | Count | Frequency (%)
31.629479 | 2 | < 0.1%
37.323564 | 1 | < 0.1%
37.61207 | 2 | < 0.1%
37.614475 | 5 | 0.1%
37.614544 | 1 | < 0.1%
37.614683 | 3 | < 0.1%
37.614823 | 6 | 0.1%
37.614824 | 3 | < 0.1%
37.615128 | 1 | < 0.1%
37.615143 | 3 | < 0.1%

Largest values

Value | Count | Frequency (%)
37.929737 | 4 | < 0.1%
37.68483 | 3 | < 0.1%
37.684206 | 2 | < 0.1%
37.684121 | 2 | < 0.1%
37.683925 | 3 | < 0.1%
37.683732 | 4 | < 0.1%
37.683705 | 1 | < 0.1%
37.68347 | 5 | 0.1%
37.683428 | 3 | < 0.1%
37.683425 | 1 | < 0.1%

경도 (longitude)
Real number (ℝ)

SKEWED

Distinct: 3818
Distinct (%): 38.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 127.0431
Minimum: 37.624061
Maximum: 127.70959
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 37.624061
5-th percentile: 127.05154
Q1: 127.06047
median: 127.07095
Q3: 127.07752
95-th percentile: 127.08546
Maximum: 127.70959
Range: 90.085534
Interquartile range (IQR): 0.017049

Descriptive statistics

Standard deviation: 1.5491585
Coefficient of variation (CV): 0.012193961
Kurtosis: 3329.4344
Mean: 127.0431
Median Absolute Deviation (MAD): 0.007812
Skewness: -57.710349
Sum: 1270431
Variance: 2.3998922
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
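
Several of the descriptive statistics reported for 경도 (longitude) are tied together by simple identities: the variance is the square of the standard deviation, the coefficient of variation is the standard deviation divided by the mean, and the sum is the mean times the number of observations. The following sketch checks the reported figures against those identities. The tolerance is an assumption chosen to absorb the rounding in the report.

```python
import math

# Values as reported for 경도 (longitude); n is the number of observations.
n = 10000
mean = 127.0431
std = 1.5491585
variance = 2.3998922
cv = 0.012193961
total = 1270431

# Relative tolerance is an assumption that allows for the report's rounding.
REL_TOL = 1e-5

checks = {
    "variance == std**2": math.isclose(std ** 2, variance, rel_tol=REL_TOL),
    "cv == std / mean": math.isclose(std / mean, cv, rel_tol=REL_TOL),
    "sum == mean * n": math.isclose(mean * n, total, rel_tol=REL_TOL),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

All three checks pass on the reported values, so the figures are internally consistent even though the minimum (37.624061) sits far from the rest of the distribution.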

Most frequent values

Value | Count | Frequency (%)
127.075255 | 15 | 0.1%
127.058309 | 11 | 0.1%
127.06976 | 10 | 0.1%
127.058652 | 10 | 0.1%
127.069653 | 9 | 0.1%
127.059021 | 9 | 0.1%
127.07066 | 9 | 0.1%
127.075046 | 9 | 0.1%
127.059357 | 9 | 0.1%
127.077618 | 9 | 0.1%
Other values (3808) | 9900 | 99.0%

Smallest values

Value | Count | Frequency (%)
37.6240612 | 3 | < 0.1%
127.040071 | 4 | < 0.1%
127.041928 | 1 | < 0.1%
127.042075 | 4 | < 0.1%
127.042219 | 2 | < 0.1%
127.042411 | 1 | < 0.1%
127.042431 | 4 | < 0.1%
127.042596 | 3 | < 0.1%
127.042597 | 6 | 0.1%
127.042598 | 1 | < 0.1%

Largest values

Value | Count | Frequency (%)
127.709595 | 2 | < 0.1%
127.111781 | 1 | < 0.1%
127.111452 | 5 | 0.1%
127.111332 | 1 | < 0.1%
127.111159 | 2 | < 0.1%
127.111155 | 4 | < 0.1%
127.11091 | 3 | < 0.1%
127.110735 | 4 | < 0.1%
127.110713 | 2 | < 0.1%
127.110352 | 3 | < 0.1%

등 밝기(%) (lamp brightness, %)
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
0: 5718
100: 4282

Length

Max length: 3
Median length: 1
Mean length: 1.8564
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 100
3rd row: 100
4th row: 0
5th row: 100

Common Values

Value | Count | Frequency (%)
0 | 5718 | 57.2%
100 | 4282 | 42.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of value frequencies; 0: 5718 (57.2%), 100: 4282 (42.8%)]

불량여부(정상:0, 불량:1) (defect status; normal: 0, defective: 1)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
0: 9895
1: 105

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9895 | 99.0%
1 | 105 | 1.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of value frequencies; 0: 9895 (99.0%), 1: 105 (1.1%)]

등록일자 (registration date)
DateTime

Distinct: 8503
Distinct (%): 85.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2023-12-25 00:09:32
Maximum: 2023-12-26 15:11:52
Histogram with fixed size bins (bins=50)
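
The observation window implied by the minimum and maximum timestamps can be computed directly. A minimal sketch using the standard `datetime` module, with the two timestamps copied from the report:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"
earliest = datetime.strptime("2023-12-25 00:09:32", FMT)
latest = datetime.strptime("2023-12-26 15:11:52", FMT)

# Subtracting two datetimes yields a timedelta covering the whole window.
span = latest - earliest
print(span)  # 1 day, 15:02:20
print(f"{span.total_seconds() / 3600:.1f} hours")
```

All 10000 observations therefore fall within a window of under two days.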

Interactions

[Figures: 4 interaction plots]

Correlations

Variable | 위도 | 경도 | 등 밝기(%) | 불량여부(정상:0, 불량:1)
위도 | 1.000 | 0.000 | 0.000 | 0.000
경도 | 0.000 | 1.000 | 0.016 | 0.000
등 밝기(%) | 0.000 | 0.016 | 1.000 | 0.068
불량여부(정상:0, 불량:1) | 0.000 | 0.000 | 0.068 | 1.000

Variable | 불량여부(정상:0, 불량:1) | 등 밝기(%)
불량여부(정상:0, 불량:1) | 1.000 | 0.043
등 밝기(%) | 0.043 | 1.000

Variable | 위도 | 경도 | 등 밝기(%) | 불량여부(정상:0, 불량:1)
위도 | 1.000 | -0.094 | 0.000 | 0.000
경도 | -0.094 | 1.000 | 0.010 | 0.000
등 밝기(%) | 0.000 | 0.010 | 1.000 | 0.043
불량여부(정상:0, 불량:1) | 0.000 | 0.000 | 0.043 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 기관 명 | 모델명 | 시리얼 | 위도 | 경도 | 등 밝기(%) | 불량여부(정상:0, 불량:1) | 등록일자
42919 | 노원구 | GDS-100T | 19-2792 | 37.665005 | 127.070381 | 0 | 0 | 2023-12-25 17:19:54
56179 | 노원구 | GDS-100T | lr-0005025 | 37.625706 | 127.092051 | 100 | 0 | 2023-12-25 23:14:22
63269 | 노원구 | GDS-100T | 12-0228 | 37.629266 | 127.046675 | 100 | 0 | 2023-12-26 01:54:12
11262 | 노원구 | GDS-100T | 2019-0001119 | 37.656909 | 127.069002 | 0 | 0 | 2023-12-25 04:19:24
72778 | 노원구 | GDS-100T | 12-8516 | 37.626237 | 127.077086 | 100 | 0 | 2023-12-26 06:11:37
61233 | 노원구 | GDS-100T | 21-4125 | 37.62966 | 127.052533 | 100 | 0 | 2023-12-26 01:46:38
80897 | 노원구 | GDS-100T | 12-6635 | 37.662577 | 127.067612 | 100 | 0 | 2023-12-26 09:13:49
8539 | 노원구 | GDS-100T | 25-2221 | 37.67004 | 127.080332 | 0 | 0 | 2023-12-25 03:18:15
83148 | 노원구 | GDS-100T | 16-6945 | 37.622323 | 127.088433 | 100 | 0 | 2023-12-26 10:13:53
78276 | 노원구 | GDS-100T | 15-7261 | 37.673913 | 127.057335 | 100 | 0 | 2023-12-26 08:12:52

Index | 기관 명 | 모델명 | 시리얼 | 위도 | 경도 | 등 밝기(%) | 불량여부(정상:0, 불량:1) | 등록일자
90742 | 노원구 | GDS-100T | 12-8920 | 37.62199 | 127.078618 | 100 | 0 | 2023-12-26 13:16:03
91657 | 노원구 | GDS-100T | 16-4139 | 37.674077 | 127.055771 | 100 | 0 | 2023-12-26 13:19:54
10646 | 노원구 | GDS-100T | 15-1680 | 37.621269 | 127.080987 | 0 | 0 | 2023-12-25 04:16:39
45188 | 노원구 | GDS-100T | 12-4933 | 37.620044 | 127.062235 | 0 | 0 | 2023-12-25 18:19:13
90559 | 노원구 | GDS-100T | 15-0216 | 37.625065 | 127.074706 | 100 | 0 | 2023-12-26 13:15:17
23732 | 노원구 | GDS-100T | 12-1954 | 37.621157 | 127.061046 | 0 | 0 | 2023-12-25 10:10:11
702 | 노원구 | GDS-100T | 12-2261 | 37.648398 | 127.081873 | 0 | 0 | 2023-12-25 00:11:27
84132 | 노원구 | GDS-100T | 21-7060 | 37.662322 | 127.071051 | 100 | 0 | 2023-12-26 10:17:57
56262 | 노원구 | GDS-100T | 17-1582 | 37.664582 | 127.071077 | 100 | 0 | 2023-12-25 23:14:45
65101 | 노원구 | GDS-100T | 15-9719 | 37.652502 | 127.061836 | 100 | 0 | 2023-12-26 03:12:43