Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 872
Duplicate rows (%): 8.7%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B
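
The two derived figures above (duplicate percentage and average record size) can be rechecked from the raw counts. The sketch below only redoes that arithmetic; the variable names are illustrative and it assumes 1 KiB = 1024 bytes.

```python
# Raw figures copied from the Dataset statistics table.
n_observations = 10_000
n_duplicate_rows = 872
total_size_kib = 419.9

# Share of rows flagged as duplicates, in percent.
duplicate_pct = 100 * n_duplicate_rows / n_observations

# Average bytes per record, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```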

Variable types

Numeric: 3
Text: 1

Dataset

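Description: 처리일자 [processing date], 청소년유해업소업종코드 [youth-harmful business type code], 청소년유해업소업종명 [youth-harmful business type name], 건수 [count]
Author: 강남구 [Gangnam-gu]
URL: https://data.seoul.go.kr/dataList/OA-11271/S/1/datasetView.do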

Alerts

Dataset has 872 (8.7%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2024-05-11 16:28:02.112878
Analysis finished: 2024-05-11 16:28:07.026349
Duration: 4.91 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
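
A report like this one could be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used (that lives in config.json): the CSV path, the output path, the report title, and the file encoding are all assumptions, and `pandas` and `ydata-profiling` must be installed.

```python
from pathlib import Path


def build_report(csv_path: str, out_path: str = "report.html",
                 encoding: str = "utf-8") -> Path:
    """Profile a CSV file and write an HTML report (illustrative helper)."""
    # Imported lazily so this module loads even where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; a different one may be needed for this file.
    df = pd.read_csv(csv_path, encoding=encoding)

    # The title is a placeholder, not the title of the original report.
    profile = ProfileReport(df, title="Profiling report")
    profile.to_file(out_path)
    return Path(out_path)
```

Calling `build_report("data.csv")` with a local copy of the dataset (a hypothetical file name) would write `report.html` to the working directory.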

Variables

처리일자
Real number (ℝ)

Distinct: 30
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240449
Minimum: 20240411
Maximum: 20240510
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20240411
5-th percentile: 20240412
Q1: 20240418
Median: 20240425
Q3: 20240503
95-th percentile: 20240509
Maximum: 20240510
Range: 99
Interquartile range (IQR): 85
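
The values of 처리일자 look like dates written as YYYYMMDD integers. That is an inference from values such as 20240411 and 20240510, not something the report states. If it holds, the range and IQR above are differences between integer encodings, not day counts. A small sketch of the difference, using only the quantiles listed above (`parse_yyyymmdd` is an illustrative helper):

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Interpret an integer such as 20240411 as a calendar date."""
    return datetime.strptime(str(value), "%Y%m%d").date()


# Quantiles copied from the table above.
minimum, q1, q3, maximum = 20240411, 20240418, 20240503, 20240510

# Differences on the raw integers, as reported by the profiler.
int_range = maximum - minimum
int_iqr = q3 - q1

# The same differences measured in calendar days.
day_range = (parse_yyyymmdd(maximum) - parse_yyyymmdd(minimum)).days
day_iqr = (parse_yyyymmdd(q3) - parse_yyyymmdd(q1)).days

print(int_range, int_iqr)  # 99 85
print(day_range, day_iqr)  # 29 15
```

Under this reading the column spans 30 calendar days inclusive, which agrees with the 30 distinct values reported.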

Descriptive statistics

Standard deviation: 40.329197
Coefficient of variation (CV): 1.9925051 × 10^-6
Kurtosis: -1.4700426
Mean: 20240449
Median Absolute Deviation (MAD): 9
Skewness: 0.68050765
Sum: 2.0240449 × 10^11
Variance: 1626.4442
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=30)]

Most frequent values

Value | Count | Frequency (%)
20240417 | 363 | 3.6%
20240411 | 358 | 3.6%
20240418 | 356 | 3.6%
20240505 | 344 | 3.4%
20240504 | 344 | 3.4%
20240421 | 342 | 3.4%
20240419 | 341 | 3.4%
20240416 | 340 | 3.4%
20240501 | 340 | 3.4%
20240415 | 340 | 3.4%
Other values (20) | 6532 | 65.3%

Smallest 10 values

Value | Count | Frequency (%)
20240411 | 358 | 3.6%
20240412 | 335 | 3.4%
20240413 | 328 | 3.3%
20240414 | 313 | 3.1%
20240415 | 340 | 3.4%
20240416 | 340 | 3.4%
20240417 | 363 | 3.6%
20240418 | 356 | 3.6%
20240419 | 341 | 3.4%
20240420 | 326 | 3.3%

Largest 10 values

Value | Count | Frequency (%)
20240510 | 326 | 3.3%
20240509 | 310 | 3.1%
20240508 | 336 | 3.4%
20240507 | 327 | 3.3%
20240506 | 321 | 3.2%
20240505 | 344 | 3.4%
20240504 | 344 | 3.4%
20240503 | 334 | 3.3%
20240502 | 328 | 3.3%
20240501 | 340 | 3.4%

청소년유해업소업종코드
Real number (ℝ)

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12942.245
Minimum: 10101
Maximum: 30111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10101
5-th percentile: 10103
Q1: 10113
Median: 10210
Q3: 10413
95-th percentile: 24201
Maximum: 30111
Range: 20010
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 5238.7691
Coefficient of variation (CV): 0.40478055
Kurtosis: 1.2386915
Mean: 12942.245
Median Absolute Deviation (MAD): 107
Skewness: 1.6201054
Sum: 1.2942245 × 10^8
Variance: 27444702
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]

Most frequent values

Value | Count | Frequency (%)
20105 | 248 | 2.5%
10199 | 245 | 2.5%
10412 | 241 | 2.4%
10411 | 239 | 2.4%
10113 | 239 | 2.4%
10105 | 239 | 2.4%
10116 | 239 | 2.4%
10410 | 237 | 2.4%
24113 | 234 | 2.3%
10117 | 233 | 2.3%
Other values (49) | 7606 | 76.1%

Smallest 10 values

Value | Count | Frequency (%)
10101 | 208 | 2.1%
10102 | 210 | 2.1%
10103 | 230 | 2.3%
10104 | 221 | 2.2%
10105 | 239 | 2.4%
10106 | 224 | 2.2%
10107 | 189 | 1.9%
10108 | 113 | 1.1%
10109 | 14 | 0.1%
10110 | 179 | 1.8%

Largest 10 values

Value | Count | Frequency (%)
30111 | 122 | 1.2%
30110 | 64 | 0.6%
24205 | 227 | 2.3%
24201 | 186 | 1.9%
24113 | 234 | 2.3%
24101 | 90 | 0.9%
20399 | 151 | 1.5%
20301 | 232 | 2.3%
20199 | 103 | 1.0%
20107 | 192 | 1.9%

청소년유해업소업종명
Text

Distinct: 54
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 4.1991
Min length: 2

Characters and Unicode

Total characters: 41991
Distinct characters: 128
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
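
As an illustration of those character properties, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module for the five sample values this report lists for the variable. It is not the profiler's own implementation, and it prints two-letter category codes (`Lo`, `Ps`, `Pe`, `Zs`), which correspond to the long names Other Letter, Open Punctuation, Close Punctuation, and Space Separator used in this report.

```python
import unicodedata
from collections import Counter

# The five sample values listed for this variable in the report.
samples = ["비어(바)살롱", "통닭(치킨)", "패스트푸드", "다방", "이용업 기타"]

# Count the general category of every character across all samples.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

# Length of each sample in code points.
lengths = [len(value) for value in samples]

for code, count in category_counts.most_common():
    print(code, count)
```

On these five values the tally is dominated by `Lo` (Hangul syllables), in line with the category breakdown reported for the full column.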

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 비어(바)살롱 [beer (bar) salon]
2nd row: 통닭(치킨) [fried chicken]
3rd row: 패스트푸드 [fast food]
4th row: 다방 [coffeehouse]
5th row: 이용업 기타 [barbershop, other]
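
Most frequent words

Value | Count | Frequency (%) | English gloss (added)
기타 | 915 | 8.9% | other
패스트푸드 | 465 | 4.5% | fast food
전통찻집 | 297 | 2.9% | traditional teahouse
관광호텔 | 269 | 2.6% | tourist hotel
여관업 | 248 | 2.4% | inn
커피숍 | 241 | 2.4% | coffee shop
통닭(치킨 | 239 | 2.3% | fried chicken (token shown as reported)
분식 | 239 | 2.3% | snack food
생선회 | 239 | 2.3% | raw fish
편의점 | 237 | 2.3% | convenience store
Other values (44) | 6865 | 66.9% |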

Most occurring characters

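Value | Count | Frequency (%)
(not recoverable) | 1849 | 4.4%
(not recoverable) | 1564 | 3.7%
( | 1182 | 2.8%
) | 1182 | 2.8%
(not recoverable) | 1090 | 2.6%
(not recoverable) | 999 | 2.4%
(not recoverable) | 915 | 2.2%
(not recoverable) | 915 | 2.2%
(not recoverable) | 869 | 2.1%
(not recoverable) | 776 | 1.8%
Other values (118) | 30650 | 73.0%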

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 39184 | 93.3%
Open Punctuation | 1182 | 2.8%
Close Punctuation | 1182 | 2.8%
Space Separator | 254 | 0.6%
Other Punctuation | 189 | 0.5%

Most frequent character per category

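Other Letter

Value | Count | Frequency (%)
(not recoverable) | 1849 | 4.7%
(not recoverable) | 1564 | 4.0%
(not recoverable) | 1090 | 2.8%
(not recoverable) | 999 | 2.5%
(not recoverable) | 915 | 2.3%
(not recoverable) | 915 | 2.3%
(not recoverable) | 869 | 2.2%
(not recoverable) | 776 | 2.0%
(not recoverable) | 736 | 1.9%
(not recoverable) | 675 | 1.7%
Other values (114) | 28796 | 73.5%

Open Punctuation

Value | Count | Frequency (%)
( | 1182 | 100.0%

Close Punctuation

Value | Count | Frequency (%)
) | 1182 | 100.0%

Space Separator

Value | Count | Frequency (%)
(space) | 254 | 100.0%

Other Punctuation

Value | Count | Frequency (%)
, | 189 | 100.0%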

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 39184 | 93.3%
Common | 2807 | 6.7%

Most frequent character per script

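Hangul

Value | Count | Frequency (%)
(not recoverable) | 1849 | 4.7%
(not recoverable) | 1564 | 4.0%
(not recoverable) | 1090 | 2.8%
(not recoverable) | 999 | 2.5%
(not recoverable) | 915 | 2.3%
(not recoverable) | 915 | 2.3%
(not recoverable) | 869 | 2.2%
(not recoverable) | 776 | 2.0%
(not recoverable) | 736 | 1.9%
(not recoverable) | 675 | 1.7%
Other values (114) | 28796 | 73.5%

Common

Value | Count | Frequency (%)
( | 1182 | 42.1%
) | 1182 | 42.1%
(space) | 254 | 9.0%
, | 189 | 6.7%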

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 39184 | 93.3%
ASCII | 2807 | 6.7%

Most frequent character per block

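Hangul

Value | Count | Frequency (%)
(not recoverable) | 1849 | 4.7%
(not recoverable) | 1564 | 4.0%
(not recoverable) | 1090 | 2.8%
(not recoverable) | 999 | 2.5%
(not recoverable) | 915 | 2.3%
(not recoverable) | 915 | 2.3%
(not recoverable) | 869 | 2.2%
(not recoverable) | 776 | 2.0%
(not recoverable) | 736 | 1.9%
(not recoverable) | 675 | 1.7%
Other values (114) | 28796 | 73.5%

ASCII

Value | Count | Frequency (%)
( | 1182 | 42.1%
) | 1182 | 42.1%
(space) | 254 | 9.0%
, | 189 | 6.7%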

건수
Real number (ℝ)

Distinct: 727
Distinct (%): 7.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 158.1021
Minimum: 1
Maximum: 4929
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 24.5
Q3: 138
95-th percentile: 624
Maximum: 4929
Range: 4928
Interquartile range (IQR): 134

Descriptive statistics

Standard deviation: 397.59063
Coefficient of variation (CV): 2.5147713
Kurtosis: 47.142606
Mean: 158.1021
Median Absolute Deviation (MAD): 23.5
Skewness: 5.90409
Sum: 1581021
Variance: 158078.31
Monotonicity: Not monotonic
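
Several of these statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. The sketch below rechecks both from the rounded values listed above, so agreement is only expected up to rounding; the variable names are illustrative.

```python
import math

# Values copied from the descriptive statistics for 건수.
mean = 158.1021
std_dev = 397.59063
reported_cv = 2.5147713
reported_variance = 158078.31

# Coefficient of variation: spread relative to the mean.
cv = std_dev / mean

# Variance: the square of the standard deviation.
variance = std_dev ** 2

print(f"CV: {cv:.4f}")
print(f"Variance: {variance:.2f}")
print(math.isclose(cv, reported_cv, rel_tol=1e-5))
print(math.isclose(variance, reported_variance, rel_tol=1e-5))
```

A CV above 2, together with a mean (158.1021) far above the median (24.5), points to a strongly right-skewed distribution, which matches the reported skewness of 5.90409.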
[Figure: histogram with fixed-size bins (bins=50)]

Most frequent values

Value | Count | Frequency (%)
1 | 1108 | 11.1%
2 | 743 | 7.4%
4 | 457 | 4.6%
3 | 376 | 3.8%
5 | 298 | 3.0%
7 | 206 | 2.1%
6 | 190 | 1.9%
8 | 188 | 1.9%
9 | 154 | 1.5%
14 | 151 | 1.5%
Other values (717) | 6129 | 61.3%

Smallest 10 values

Value | Count | Frequency (%)
1 | 1108 | 11.1%
2 | 743 | 7.4%
3 | 376 | 3.8%
4 | 457 | 4.6%
5 | 298 | 3.0%
6 | 190 | 1.9%
7 | 206 | 2.1%
8 | 188 | 1.9%
9 | 154 | 1.5%
10 | 82 | 0.8%

Largest 10 values

Value | Count | Frequency (%)
4929 | 1 | < 0.1%
4926 | 5 | 0.1%
4925 | 1 | < 0.1%
4924 | 1 | < 0.1%
4923 | 1 | < 0.1%
4922 | 4 | < 0.1%
4921 | 1 | < 0.1%
4916 | 1 | < 0.1%
3227 | 2 | < 0.1%
3221 | 2 | < 0.1%

Interactions

[Figures: nine pairwise interaction plots of the numeric variables]

Correlations

[Figure: correlation heatmap, all four variables]

Variable | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
처리일자 | 1.000 | 0.018 | 0.000 | 0.000
청소년유해업소업종코드 | 0.018 | 1.000 | 0.999 | 0.144
청소년유해업소업종명 | 0.000 | 0.999 | 1.000 | 0.737
건수 | 0.000 | 0.144 | 0.737 | 1.000

[Figure: correlation heatmap, numeric variables only]

Variable | 처리일자 | 청소년유해업소업종코드 | 건수
처리일자 | 1.000 | 0.007 | -0.013
청소년유해업소업종코드 | 0.007 | 1.000 | -0.263
건수 | -0.013 | -0.263 | 1.000

Missing values

[Figure: nullity bar chart]
A simple visualization of nullity by column.
[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

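First rows

(index) | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수 | English gloss (added)
19872 | 20240429 | 10207 | 비어(바)살롱 | 6 | beer (bar) salon
11021 | 20240421 | 10113 | 통닭(치킨) | 73 | fried chicken
2334 | 20240413 | 10411 | 패스트푸드 | 50 | fast food
12260 | 20240422 | 10402 | 다방 | 8 | coffeehouse
27032 | 20240505 | 20399 | 이용업 기타 | 3 | barbershop, other
9030 | 20240419 | 10104 | 일식 | 141 | Japanese food
20229 | 20240429 | 10409 | 백화점 | 4 | department store
26638 | 20240505 | 10412 | 커피숍 | 1464 | coffee shop
1424 | 20240412 | 10115 | 김밥(도시락) | 19 | gimbap (lunch box)
11637 | 20240421 | 10116 | 생선회 | 32 | raw fish

Last rows

(index) | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수 | English gloss (added)
28850 | 20240507 | 10408 | 철도역구내 | 6 | railway station premises
20704 | 20240429 | 10113 | 통닭(치킨) | 160 | fried chicken
19330 | 20240428 | 10111 | 패스트푸드 | 16 | fast food
6206 | 20240416 | 20101 | 관광호텔 | 18 | tourist hotel
2723 | 20240413 | 10104 | 일식 | 995 | Japanese food
29071 | 20240507 | 20105 | 여관업 | 10 | inn
16029 | 20240425 | 10499 | 기타 | 283 | other
25450 | 20240504 | 20107 | 여인숙업 | 4 | lodging house
8908 | 20240419 | 10203 | 관광호텔나이트(디스코) | 1 | tourist hotel nightclub (disco)
14752 | 20240424 | 20199 | 숙박업 기타 | 2 | lodging, other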

Duplicate rows

Most frequently occurring

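(index) | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수 | # duplicates | English gloss (added)
23 | 20240411 | 10413 | 전통찻집 | 1 | 5 | traditional teahouse
40 | 20240412 | 10201 | 카바레 | 2 | 5 | cabaret
191 | 20240417 | 10413 | 전통찻집 | 1 | 5 | traditional teahouse
224 | 20240418 | 10409 | 백화점 | 1 | 5 | department store
226 | 20240418 | 10413 | 전통찻집 | 1 | 5 | traditional teahouse
235 | 20240418 | 24201 | 비디오물감상실업 | 1 | 5 | video viewing room
244 | 20240419 | 10115 | 김밥(도시락) | 15 | 5 | gimbap (lunch box)
420 | 20240425 | 10201 | 카바레 | 2 | 5 | cabaret
557 | 20240429 | 10413 | 전통찻집 | 1 | 5 | traditional teahouse
581 | 20240430 | 10413 | 전통찻집 | 1 | 5 | traditional teahouse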
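
A duplicate-row table of this kind can be approximated by counting identical rows and keeping those that occur more than once. The sketch below uses only the standard library and a handful of made-up rows; the values echo the table above, but the multiplicities are invented for brevity. How the profiler itself defines and counts duplicates is not shown in this report, so treat this as one plausible reading.

```python
from collections import Counter

# Illustrative rows: (처리일자, 청소년유해업소업종코드, 청소년유해업소업종명, 건수).
rows = [
    (20240411, 10413, "전통찻집", 1),
    (20240411, 10413, "전통찻집", 1),
    (20240412, 10201, "카바레", 2),
    (20240412, 10201, "카바레", 2),
    (20240412, 10201, "카바레", 2),
    (20240413, 10104, "일식", 995),
]

# Count how many times each exact row occurs.
counts = Counter(rows)

# Keep only rows that occur more than once.
duplicated = {row: n for row, n in counts.items() if n > 1}

# Sort by multiplicity, most frequent first.
most_frequent = sorted(duplicated.items(), key=lambda item: item[1], reverse=True)

for row, n in most_frequent:
    print(row, n)
```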