Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 742.2 KiB
Average record size in memory: 76.0 B
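
The average record size can be checked against the total memory size and the number of observations. A minimal sketch, assuming 1 KiB = 1024 bytes and using the rounded figures printed in the report (the variable names are illustrative only):

```python
# Figures copied from the dataset statistics (rounded as shown in the report).
total_kib = 742.2
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```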

Variable types

Categorical: 4
Text: 1
Numeric: 2
DateTime: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15972/S/1/datasetView.do

Alerts

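기관 명 (agency name) has constant value "노원구" [Constant]
모델명 (model name) has constant value "GDS-100T" [Constant]
불량여부(정상:0, 불량:1) (defect status; normal: 0, defective: 1) is highly imbalanced (92.5%) [Imbalance]
위도 (latitude) is highly skewed (γ1 = -54.14532231) [Skewed]
경도 (longitude) is highly skewed (γ1 = -57.71042338) [Skewed]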
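
The report does not state how the 92.5% imbalance figure is derived. One reading that comes out close to it is an entropy-based score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. The sketch below assumes that definition; `imbalance_score` is a hypothetical helper, not a function taken from the report, and the counts are those listed for the defect-status variable.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/log2(k): 0 means perfectly balanced, values near 1 mean one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumed convention for a single-class column
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts from the report: 9908 normal (0), 92 defective (1).
score = imbalance_score([9908, 92])
print(f"{score:.1%}")
```

A score near 1 here reflects that fewer than 1% of the rows are marked defective.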

Reproduction

Analysis started: 2024-05-04 00:44:11.507301
Analysis finished: 2024-05-04 00:44:13.563172
Duration: 2.06 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
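
The reported duration is the difference between the two analysis timestamps. A small check using only the standard library (the format string is an assumption based on how the timestamps are printed):

```python
from datetime import datetime

# Timestamp layout assumed from the report: date, time, microseconds.
fmt = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-05-04 00:44:11.507301", fmt)
finished = datetime.strptime("2024-05-04 00:44:13.563172", fmt)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```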

Variables

기관 명 (agency name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
노원구 (Nowon-gu): 10000

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 노원구
2nd row: 노원구
3rd row: 노원구
4th row: 노원구
5th row: 노원구

Common Values

Value | Count | Frequency (%)
노원구 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the common values listed above]

모델명 (model name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
GDS-100T: 10000

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: GDS-100T
2nd row: GDS-100T
3rd row: GDS-100T
4th row: GDS-100T
5th row: GDS-100T

Common Values

Value | Count | Frequency (%)
GDS-100T | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the common values listed above]

시리얼 (serial)
Text

Distinct: 4065
Distinct (%): 40.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 7
Mean length: 7.6537
Min length: 7

Characters and Unicode

Total characters: 76537
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
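
As a rough illustration of such a character-property analysis, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The five strings are the sample rows listed for this variable; the report's own counting code is not shown in the source, so this is only an approximation of the idea.

```python
import unicodedata
from collections import Counter

# The five sample serial values shown for this variable.
serials = ["21-3647", "21-4896", "12-4817", "15-4166", "20-7204"]

# Map every character to its two-letter general category
# (Nd = decimal number, Pd = dash punctuation, Ll = lowercase letter, ...).
category_counts = Counter(
    unicodedata.category(ch) for serial in serials for ch in serial
)
print(category_counts)
```

Each of these serials consists of six digits and one hyphen, so only the decimal-number and dash-punctuation categories appear for this small sample.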

Unique

Unique: 1022
Unique (%): 10.2%

Sample

1st row: 21-3647
2nd row: 21-4896
3rd row: 12-4817
4th row: 15-4166
5th row: 20-7204

Value | Count | Frequency (%)
17-5829 | 9 | 0.1%
25-4607 | 8 | 0.1%
2019-0000531 | 8 | 0.1%
12-4538 | 8 | 0.1%
16-7862 | 8 | 0.1%
2019-0004203 | 7 | 0.1%
15-7703 | 7 | 0.1%
13-1369 | 7 | 0.1%
25-7853 | 7 | 0.1%
15-7038 | 7 | 0.1%
Other values (4055) | 9924 | 99.2%

Most occurring characters

Value | Count | Frequency (%)
1 | 12984 | 17.0%
2 | 10543 | 13.8%
- | 10000 | 13.1%
0 | 9672 | 12.6%
5 | 6579 | 8.6%
9 | 5097 | 6.7%
6 | 4677 | 6.1%
4 | 4468 | 5.8%
3 | 4438 | 5.8%
7 | 4166 | 5.4%
Other values (3) | 3913 | 5.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 66509 | 86.9%
Dash Punctuation | 10000 | 13.1%
Lowercase Letter | 28 | < 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 12984 | 19.5%
2 | 10543 | 15.9%
0 | 9672 | 14.5%
5 | 6579 | 9.9%
9 | 5097 | 7.7%
6 | 4677 | 7.0%
4 | 4468 | 6.7%
3 | 4438 | 6.7%
7 | 4166 | 6.3%
8 | 3885 | 5.8%

Lowercase Letter
Value | Count | Frequency (%)
l | 14 | 50.0%
r | 14 | 50.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 10000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 76509 | > 99.9%
Latin | 28 | < 0.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 12984 | 17.0%
2 | 10543 | 13.8%
- | 10000 | 13.1%
0 | 9672 | 12.6%
5 | 6579 | 8.6%
9 | 5097 | 6.7%
6 | 4677 | 6.1%
4 | 4468 | 5.8%
3 | 4438 | 5.8%
7 | 4166 | 5.4%

Latin
Value | Count | Frequency (%)
l | 14 | 50.0%
r | 14 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 76537 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 12984 | 17.0%
2 | 10543 | 13.8%
- | 10000 | 13.1%
0 | 9672 | 12.6%
5 | 6579 | 8.6%
9 | 5097 | 6.7%
6 | 4677 | 6.1%
4 | 4468 | 5.8%
3 | 4438 | 5.8%
7 | 4166 | 5.4%
Other values (3) | 3913 | 5.1%

위도 (latitude)
Real number (ℝ)

SKEWED

Distinct: 3876
Distinct (%): 38.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.644203
Minimum: 31.629479
Maximum: 37.929737
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 31.629479
5-th percentile: 37.618798
Q1: 37.624441
Median: 37.649351
Q3: 37.664303
95-th percentile: 37.675493
Maximum: 37.929737
Range: 6.300258
Interquartile range (IQR): 0.039862
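
The range and IQR follow directly from the quantiles above. The sketch below recomputes them and, as an illustration only, compares the extremes with the conventional Tukey fences (Q1 - 1.5·IQR and Q3 + 1.5·IQR). The 1.5 multiplier is a common rule of thumb assumed here; the report itself does not apply it.

```python
# Quantiles copied from the report for 위도 (latitude).
q1, q3 = 37.624441, 37.664303
minimum, maximum = 31.629479, 37.929737

iqr = q3 - q1
value_range = maximum - minimum

# Assumed rule of thumb: points beyond 1.5 * IQR from the quartiles count as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(round(value_range, 6), round(iqr, 6))
print(minimum < lower_fence, maximum > upper_fence)
```

Under that assumption both the minimum and the maximum fall outside the fences, which fits the extreme skewness and kurtosis reported for this column.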

Descriptive statistics

Standard deviation: 0.10644024
Coefficient of variation (CV): 0.0028275333
Kurtosis: 3058.0431
Mean: 37.644203
Median Absolute Deviation (MAD): 0.0211395
Skewness: -54.145322
Sum: 376442.03
Variance: 0.011329524
Monotonicity: Not monotonic
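
Several of these descriptive statistics are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. A short consistency check using the rounded values printed in the report (so agreement is only up to rounding):

```python
# Values copied from the report for 위도 (latitude).
std = 0.10644024
mean = 37.644203
n = 10_000  # number of observations, from the dataset statistics

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation
total = mean * n       # sum of all values

print(f"CV={cv:.10f} variance={variance:.9f} sum={total:.2f}")
```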
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
37.675216 | 10 | 0.1%
37.672942 | 10 | 0.1%
37.658536 | 9 | 0.1%
37.66089 | 9 | 0.1%
37.66163 | 9 | 0.1%
37.623904 | 9 | 0.1%
37.676001 | 9 | 0.1%
37.622302 | 9 | 0.1%
37.673793 | 9 | 0.1%
37.624625 | 8 | 0.1%
Other values (3866) | 9909 | 99.1%

Minimum 10 values

Value | Count | Frequency (%)
31.629479 | 3 | < 0.1%
37.323564 | 2 | < 0.1%
37.614475 | 2 | < 0.1%
37.614544 | 2 | < 0.1%
37.614683 | 2 | < 0.1%
37.614823 | 4 | < 0.1%
37.614824 | 4 | < 0.1%
37.615128 | 1 | < 0.1%
37.615143 | 1 | < 0.1%
37.615261 | 2 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
37.929737 | 3 | < 0.1%
37.68483 | 3 | < 0.1%
37.684206 | 6 | 0.1%
37.684121 | 2 | < 0.1%
37.683963 | 2 | < 0.1%
37.683925 | 3 | < 0.1%
37.683732 | 2 | < 0.1%
37.68347 | 4 | < 0.1%
37.683428 | 2 | < 0.1%
37.683425 | 1 | < 0.1%

경도 (longitude)
Real number (ℝ)

SKEWED

Distinct: 3836
Distinct (%): 38.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 127.04293
Minimum: 37.624061
Maximum: 127.70959
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 37.624061
5-th percentile: 127.05169
Q1: 127.06019
Median: 127.0708
Q3: 127.0774
95-th percentile: 127.0854
Maximum: 127.70959
Range: 90.085534
Interquartile range (IQR): 0.0172165

Descriptive statistics

Standard deviation: 1.5491549
Coefficient of variation (CV): 0.012193949
Kurtosis: 3329.4401
Mean: 127.04293
Median Absolute Deviation (MAD): 0.0079475
Skewness: -57.710423
Sum: 1270429.3
Variance: 2.399881
Monotonicity: Not monotonic
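
A strongly negative skewness such as the one above typically arises when a few values lie far below the bulk of the data, as the 37.624061 minimum does relative to a median of 127.0708. The sketch below uses the uncorrected moment formula g1 = m3 / m2^1.5 on a small synthetic list; the data are made up for illustration, and the report's estimator may include a sample-size correction, so this is not expected to reproduce -57.710423.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no sample-size correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Synthetic toy data: four identical values and one far-off low value.
toy = [1.0, 1.0, 1.0, 1.0, -10.0]
g1 = skewness(toy)
print(g1)
```

The single low value pulls the third central moment negative, so the result is below zero; mirroring the data around its mean would flip the sign.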
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
127.058526 | 12 | 0.1%
127.058755 | 12 | 0.1%
127.073518 | 11 | 0.1%
127.058309 | 10 | 0.1%
127.084139 | 10 | 0.1%
127.070363 | 10 | 0.1%
127.052667 | 10 | 0.1%
127.076381 | 10 | 0.1%
127.069653 | 9 | 0.1%
127.080813 | 9 | 0.1%
Other values (3826) | 9897 | 99.0%

Minimum 10 values

Value | Count | Frequency (%)
37.6240612 | 3 | < 0.1%
127.040071 | 4 | < 0.1%
127.041928 | 3 | < 0.1%
127.04203 | 3 | < 0.1%
127.042075 | 4 | < 0.1%
127.042219 | 1 | < 0.1%
127.042431 | 1 | < 0.1%
127.042597 | 4 | < 0.1%
127.042598 | 4 | < 0.1%
127.042636 | 3 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
127.709595 | 2 | < 0.1%
127.111781 | 4 | < 0.1%
127.111452 | 1 | < 0.1%
127.111332 | 1 | < 0.1%
127.111225 | 2 | < 0.1%
127.111159 | 1 | < 0.1%
127.111155 | 6 | 0.1%
127.11091 | 3 | < 0.1%
127.110735 | 1 | < 0.1%
127.110713 | 2 | < 0.1%

등 밝기(%) (lamp brightness, %)
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
100: 6053
0: 3947

Length

Max length: 3
Median length: 3
Mean length: 2.2106
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 100
4th row: 100
5th row: 100

Common Values

Value | Count | Frequency (%)
100 | 6053 | 60.5%
0 | 3947 | 39.5%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the common values listed above]

불량여부(정상:0, 불량:1) (defect status; normal: 0, defective: 1)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
0: 9908
1: 92

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9908 | 99.1%
1 | 92 | 0.9%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the common values listed above]

등록일자 (registration date)
DateTime

Distinct: 8100
Distinct (%): 81.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2024-02-05 00:10:01
Maximum: 2024-02-06 10:11:54
Histogram with fixed size bins (bins=50)

Interactions

[Figures: four interaction plots]

Correlations

 | 위도 | 경도 | 등 밝기(%) | 불량여부(정상:0, 불량:1)
위도 | 1.000 | 0.000 | 0.000 | 0.000
경도 | 0.000 | 1.000 | 0.000 | 0.000
등 밝기(%) | 0.000 | 0.000 | 1.000 | 0.097
불량여부(정상:0, 불량:1) | 0.000 | 0.000 | 0.097 | 1.000

 | 등 밝기(%) | 불량여부(정상:0, 불량:1)
등 밝기(%) | 1.000 | 0.062
불량여부(정상:0, 불량:1) | 0.062 | 1.000

 | 위도 | 경도 | 등 밝기(%) | 불량여부(정상:0, 불량:1)
위도 | 1.000 | -0.090 | 0.000 | 0.000
경도 | -0.090 | 1.000 | 0.000 | 0.000
등 밝기(%) | 0.000 | 0.000 | 1.000 | 0.062
불량여부(정상:0, 불량:1) | 0.000 | 0.000 | 0.062 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 기관 명 | 모델명 | 시리얼 | 위도 | 경도 | 등 밝기(%) | 불량여부(정상:0, 불량:1) | 등록일자
17301 | 노원구 | GDS-100T | 21-3647 | 37.644418 | 127.051242 | 0 | 0 | 2024-02-05 07:12:00
21656 | 노원구 | GDS-100T | 21-4896 | 37.62568 | 127.049072 | 0 | 0 | 2024-02-05 08:19:45
72648 | 노원구 | GDS-100T | 12-4817 | 37.678358 | 127.055692 | 100 | 0 | 2024-02-06 07:10:33
43135 | 노원구 | GDS-100T | 15-4166 | 37.673607 | 127.051901 | 100 | 0 | 2024-02-05 18:18:37
39885 | 노원구 | GDS-100T | 20-7204 | 37.663064 | 127.07058 | 100 | 0 | 2024-02-05 17:13:45
65988 | 노원구 | GDS-100T | 15-0161 | 37.662964 | 127.069887 | 100 | 0 | 2024-02-06 04:15:07
1582 | 노원구 | GDS-100T | 12-5124 | 37.624932 | 127.074521 | 0 | 0 | 2024-02-05 00:14:14
23719 | 노원구 | GDS-100T | 12-1739 | 37.616314 | 127.066201 | 100 | 0 | 2024-02-05 09:19:26
49065 | 노원구 | GDS-100T | 22-3986 | 37.664842 | 127.071364 | 100 | 0 | 2024-02-05 21:14:20
13086 | 노원구 | GDS-100T | 21-6832 | 37.666846 | 127.069372 | 0 | 0 | 2024-02-05 05:15:13

Last rows

Index | 기관 명 | 모델명 | 시리얼 | 위도 | 경도 | 등 밝기(%) | 불량여부(정상:0, 불량:1) | 등록일자
22858 | 노원구 | GDS-100T | 25-4299 | 37.671108 | 127.086674 | 0 | 0 | 2024-02-05 09:15:59
25508 | 노원구 | GDS-100T | 16-7460 | 37.661423 | 127.079885 | 0 | 1 | 2024-02-05 10:18:36
12523 | 노원구 | GDS-100T | 16-8559 | 37.63664 | 127.078558 | 0 | 0 | 2024-02-05 05:12:58
49209 | 노원구 | GDS-100T | 25-8340 | 37.629048 | 127.045962 | 100 | 0 | 2024-02-05 21:14:57
26910 | 노원구 | GDS-100T | 25-6928 | 37.668926 | 127.079026 | 0 | 0 | 2024-02-05 11:15:32
1382 | 노원구 | GDS-100T | 2019-0004207 | 37.658736 | 127.062062 | 0 | 0 | 2024-02-05 00:13:39
38876 | 노원구 | GDS-100T | 16-8125 | 37.621297 | 127.081989 | 100 | 0 | 2024-02-05 16:18:17
61161 | 노원구 | GDS-100T | 16-6261 | 37.650953 | 127.062109 | 100 | 0 | 2024-02-06 01:56:07
44505 | 노원구 | GDS-100T | 21-7264 | 37.661473 | 127.066755 | 100 | 0 | 2024-02-05 19:14:24
28958 | 노원구 | GDS-100T | 13-0282 | 37.623232 | 127.05185 | 100 | 0 | 2024-02-05 12:14:08