Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 870
Duplicate rows (%): 8.7%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B
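
The two derived figures in this block follow from the raw counts. A minimal Python sketch of the arithmetic, assuming that KiB means 1024 bytes and that the reported totals can be used as exact inputs (the variable names are illustrative and not part of the report):

```python
# Values copied from the "Dataset statistics" block of the report.
n_observations = 10_000
n_duplicate_rows = 870
total_size_kib = 419.9  # assumed to be kibibytes (1 KiB = 1024 B)

# Share of duplicate rows as a percentage of all observations.
duplicate_pct = n_duplicate_rows / n_observations * 100

# Average in-memory size of one record, in bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both results round to the values shown in the report (8.7% and 43.0 B).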

Variable types

Numeric: 3
Text: 1

Dataset

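Description: 처리일자 (processing date), 청소년유해업소업종코드 (business type code for establishments harmful to youth), 청소년유해업소업종명 (business type name for establishments harmful to youth), 건수 (count)
Author: 중랑구 (Jungnang-gu)
URL: https://data.seoul.go.kr/dataList/OA-10270/S/1/datasetView.do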

Alerts

Dataset has 870 (8.7%) duplicate rows (alert type: Duplicates)

Reproduction

Analysis started: 2024-04-06 10:13:55.866815
Analysis finished: 2024-04-06 10:13:58.647139
Duration: 2.78 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

처리일자
Real number (ℝ)

Distinct: 31
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240332
Minimum: 20240306
Maximum: 20240405
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20240306
5-th percentile: 20240307
Q1: 20240313
Median: 20240321
Q3: 20240329
95-th percentile: 20240404
Maximum: 20240405
Range: 99
Interquartile range (IQR): 16
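
Because 처리일자 is stored as a YYYYMMDD integer and profiled as a real number, the Range of 99 is a difference of integers, not a number of days, and the reported Mean of 20240332 is not a valid calendar date. The sketch below is one way to check this; it assumes the integers really do encode calendar dates, and the helper names `parse_yyyymmdd` and `is_valid_date` are hypothetical, not part of the report or of ydata-profiling.

```python
from datetime import datetime


def parse_yyyymmdd(value: int) -> datetime:
    """Interpret an integer such as 20240306 as the date 2024-03-06."""
    return datetime.strptime(str(value), "%Y%m%d")


def is_valid_date(value: int) -> bool:
    """Return True if the integer can be read as a YYYYMMDD calendar date."""
    try:
        parse_yyyymmdd(value)
        return True
    except ValueError:
        return False


# Minimum and maximum copied from the quantile statistics.
minimum, maximum = 20240306, 20240405

numeric_range = maximum - minimum  # what the report lists as "Range"
calendar_span = (parse_yyyymmdd(maximum) - parse_yyyymmdd(minimum)).days
calendar_days = calendar_span + 1  # inclusive count of calendar days

print(numeric_range, calendar_span, calendar_days)
print(is_valid_date(20240332))  # the reported mean, read as a date
```

Under that assumption the data cover 31 calendar days, which agrees with the 31 distinct values reported for this variable.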

Descriptive statistics

Standard deviation: 31.738964
Coefficient of variation (CV): 1.5681049 × 10^-6
Kurtosis: 1.0865281
Mean: 20240332
Median Absolute Deviation (MAD): 8
Skewness: 1.6639609
Sum: 2.0240332 × 10^11
Variance: 1007.3618
Monotonicity: Not monotonic
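
Two of these statistics can be recomputed from the others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. A short sketch, assuming the rounded values printed in the report are precise enough as inputs:

```python
import math

# Rounded values copied from the descriptive statistics of 처리일자.
mean = 20240332
std_dev = 31.738964

# Coefficient of variation: standard deviation relative to the mean.
cv = std_dev / mean

# Variance is the squared standard deviation.
variance = std_dev ** 2

print(f"CV = {cv:.7e}")
print(f"Variance = {variance:.4f}")

# Compare against the reported figures, allowing for rounding in the inputs.
print(math.isclose(cv, 1.5681049e-6, rel_tol=1e-6))
print(math.isclose(variance, 1007.3618, rel_tol=1e-6))
```

The very small CV is a consequence of the YYYYMMDD encoding: the mean is about twenty million while the values differ only in their last few digits.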
Histogram with fixed size bins (bins=31)

Common values

Value              Count  Frequency (%)
20240328           353    3.5%
20240326           347    3.5%
20240311           338    3.4%
20240401           338    3.4%
20240330           337    3.4%
20240315           335    3.4%
20240308           334    3.3%
20240331           331    3.3%
20240323           330    3.3%
20240319           330    3.3%
Other values (21)  6627   66.3%

Minimum 10 values

Value     Count  Frequency (%)
20240306  327    3.3%
20240307  303    3.0%
20240308  334    3.3%
20240309  311    3.1%
20240310  321    3.2%
20240311  338    3.4%
20240312  310    3.1%
20240313  306    3.1%
20240314  329    3.3%
20240315  335    3.4%

Maximum 10 values

Value     Count  Frequency (%)
20240405  321    3.2%
20240404  319    3.2%
20240403  324    3.2%
20240402  301    3.0%
20240401  338    3.4%
20240331  331    3.3%
20240330  337    3.4%
20240329  302    3.0%
20240328  353    3.5%
20240327  323    3.2%

청소년유해업소업종코드
Real number (ℝ)

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12838.429
Minimum: 10101
Maximum: 30111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10101
5-th percentile: 10103
Q1: 10113
Median: 10208
Q3: 10413
95-th percentile: 24201
Maximum: 30111
Range: 20010
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 5181.7098
Coefficient of variation (CV): 0.40360935
Kurtosis: 1.4979514
Mean: 12838.429
Median Absolute Deviation (MAD): 105
Skewness: 1.6925654
Sum: 1.2838429 × 10^8
Variance: 26850117
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
10117              249    2.5%
10199              246    2.5%
20101              242    2.4%
10112              241    2.4%
10411              238    2.4%
10410              238    2.4%
10107              236    2.4%
10412              236    2.4%
10111              232    2.3%
10103              230    2.3%
Other values (49)  7612   76.1%

Minimum 10 values

Value  Count  Frequency (%)
10101  213    2.1%
10102  228    2.3%
10103  230    2.3%
10104  212    2.1%
10105  228    2.3%
10106  202    2.0%
10107  236    2.4%
10108  112    1.1%
10109  23     0.2%
10110  188    1.9%

Maximum 10 values

Value  Count  Frequency (%)
30111  126    1.3%
30110  62     0.6%
24205  228    2.3%
24201  181    1.8%
24113  210    2.1%
24101  89     0.9%
20399  147    1.5%
20301  215    2.1%
20199  83     0.8%
20107  190    1.9%

청소년유해업소업종명
Text

Distinct: 54
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 4.2494
Min length: 2

Characters and Unicode

Total characters: 42494
Distinct characters: 128
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
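
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the code ydata-profiling runs; the sample strings are taken from values that appear in this report, except that the closing parenthesis on "정종,대포집(선술집)" is an assumption, and `category_counts` is a hypothetical helper name.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in values."""
    counts = Counter()
    for text in values:
        for ch in text:
            counts[unicodedata.category(ch)] += 1
    return counts


# Illustrative sample only; the third value is assumed to end with ")".
sample = ["호프(소주방)", "숙박업 기타", "정종,대포집(선술집)"]
counts = category_counts(sample)

# Two-letter codes map to the names used in the report:
# Lo = Other Letter, Ps = Open Punctuation, Pe = Close Punctuation,
# Po = Other Punctuation, Zs = Space Separator.
for code, n in counts.most_common():
    print(code, n)
```

The five codes that show up for this sample are the same five categories the report lists for the whole column.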

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 패스트푸드 (fast food)
2nd row: 패스트푸드 (fast food)
3rd row: 극장식당 (theater restaurant)
4th row: 복어취급 (pufferfish handling)
5th row: 단란주점 (karaoke bar)

Value                                         Count  Frequency (%)
기타 (other)                                   860    8.4%
패스트푸드 (fast food)                          470    4.6%
전통찻집 (traditional tea house)                323    3.2%
관광호텔 (tourist hotel)                        287    2.8%
까페 (cafe)                                    249    2.4%
호프(소주방 (hof / soju bar)                    241    2.4%
편의점 (convenience store)                      238    2.3%
커피숍 (coffee shop)                            236    2.3%
정종,대포집(선술집 (rice wine house / tavern)    236    2.3%
식육취급 (meat handling)                        230    2.2%
Other values (44)                             6860   67.1%

Most occurring characters

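Value                    Count  Frequency (%)
(Hangul, not recovered)  1742   4.1%
(Hangul, not recovered)  1566   3.7%
)                        1264   3.0%
(                        1264   3.0%
(Hangul, not recovered)  1019   2.4%
(Hangul, not recovered)  1018   2.4%
(Hangul, not recovered)  860    2.0%
(Hangul, not recovered)  860    2.0%
(Hangul, not recovered)  815    1.9%
(Hangul, not recovered)  806    1.9%
Other values (118)       31280  73.6%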

Most occurring categories

Value              Count  Frequency (%)
Other Letter       39500  93.0%
Close Punctuation  1264   3.0%
Open Punctuation   1264   3.0%
Other Punctuation  236    0.6%
Space Separator    230    0.5%

Most frequent character per category

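Other Letter

Value                    Count  Frequency (%)
(Hangul, not recovered)  1742   4.4%
(Hangul, not recovered)  1566   4.0%
(Hangul, not recovered)  1019   2.6%
(Hangul, not recovered)  1018   2.6%
(Hangul, not recovered)  860    2.2%
(Hangul, not recovered)  860    2.2%
(Hangul, not recovered)  815    2.1%
(Hangul, not recovered)  806    2.0%
(Hangul, not recovered)  795    2.0%
(Hangul, not recovered)  775    2.0%
Other values (114)       29244  74.0%

Close Punctuation

Value  Count  Frequency (%)
)      1264   100.0%

Open Punctuation

Value  Count  Frequency (%)
(      1264   100.0%

Other Punctuation

Value  Count  Frequency (%)
,      236    100.0%

Space Separator

Value    Count  Frequency (%)
(space)  230    100.0%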

Most occurring scripts

Value   Count  Frequency (%)
Hangul  39500  93.0%
Common  2994   7.0%

Most frequent character per script

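Hangul

Value                    Count  Frequency (%)
(Hangul, not recovered)  1742   4.4%
(Hangul, not recovered)  1566   4.0%
(Hangul, not recovered)  1019   2.6%
(Hangul, not recovered)  1018   2.6%
(Hangul, not recovered)  860    2.2%
(Hangul, not recovered)  860    2.2%
(Hangul, not recovered)  815    2.1%
(Hangul, not recovered)  806    2.0%
(Hangul, not recovered)  795    2.0%
(Hangul, not recovered)  775    2.0%
Other values (114)       29244  74.0%

Common

Value    Count  Frequency (%)
)        1264   42.2%
(        1264   42.2%
,        236    7.9%
(space)  230    7.7%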

Most occurring blocks

Value   Count  Frequency (%)
Hangul  39500  93.0%
ASCII   2994   7.0%

Most frequent character per block

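Hangul

Value                    Count  Frequency (%)
(Hangul, not recovered)  1742   4.4%
(Hangul, not recovered)  1566   4.0%
(Hangul, not recovered)  1019   2.6%
(Hangul, not recovered)  1018   2.6%
(Hangul, not recovered)  860    2.2%
(Hangul, not recovered)  860    2.2%
(Hangul, not recovered)  815    2.1%
(Hangul, not recovered)  806    2.0%
(Hangul, not recovered)  795    2.0%
(Hangul, not recovered)  775    2.0%
Other values (114)       29244  74.0%

ASCII

Value    Count  Frequency (%)
)        1264   42.2%
(        1264   42.2%
,        236    7.9%
(space)  230    7.7%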

건수
Real number (ℝ)

Distinct: 711
Distinct (%): 7.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 158.8499
Minimum: 1
Maximum: 4914
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 24
Q3: 138
95-th percentile: 630
Maximum: 4914
Range: 4913
Interquartile range (IQR): 134
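
The quartiles above are enough to apply Tukey's 1.5 × IQR rule, a common convention for flagging unusually large or small values. The report itself does not apply this rule; the sketch below is only an illustration of how the quantiles of 건수 could be used, with illustrative variable names.

```python
# Quantiles copied from the report for 건수.
q1, q3 = 4, 138
p95 = 630
minimum = 1

# Interquartile range and Tukey fences (1.5 * IQR beyond the quartiles).
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"IQR = {iqr}")
print(f"Fences = [{lower_fence}, {upper_fence}]")

# The 95th percentile already lies above the upper fence, so at least
# 5% of observations would be flagged as high values under this rule.
print(p95 > upper_fence)

# The lower fence is below the minimum, so nothing is flagged as low.
print(minimum > lower_fence)
```

This is consistent with the strong right skew the report shows for this variable.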

Descriptive statistics

Standard deviation: 393.53372
Coefficient of variation (CV): 2.4773936
Kurtosis: 38.161475
Mean: 158.8499
Median Absolute Deviation (MAD): 23
Skewness: 5.4145577
Sum: 1588499
Variance: 154868.79
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
1                   1108   11.1%
2                   709    7.1%
4                   463    4.6%
3                   377    3.8%
5                   324    3.2%
8                   218    2.2%
7                   209    2.1%
6                   173    1.7%
11                  157    1.6%
14                  149    1.5%
Other values (701)  6113   61.1%

Minimum 10 values

Value  Count  Frequency (%)
1      1108   11.1%
2      709    7.1%
3      377    3.8%
4      463    4.6%
5      324    3.2%
6      173    1.7%
7      209    2.1%
8      218    2.2%
9      143    1.4%
10     90     0.9%

Maximum 10 values

Value  Count  Frequency (%)
4914   1      < 0.1%
4913   1      < 0.1%
4912   1      < 0.1%
4903   1      < 0.1%
4901   1      < 0.1%
4897   1      < 0.1%
4896   2      < 0.1%
3228   2      < 0.1%
3226   1      < 0.1%
3225   2      < 0.1%

Interactions

[Figures: nine interaction plots, not reproduced]

Correlations

                        처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
처리일자                  1.000    0.000                0.000              0.000
청소년유해업소업종코드      0.000    1.000                0.999              0.137
청소년유해업소업종명        0.000    0.999                1.000              0.730
건수                      0.000    0.137                0.730              1.000

                        처리일자  청소년유해업소업종코드  건수
처리일자                  1.000    -0.007               -0.002
청소년유해업소업종코드      -0.007   1.000                -0.261
건수                      -0.002   -0.261               1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

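Index  처리일자   청소년유해업소업종코드  청소년유해업소업종명                 건수
27726  20240311  10411                패스트푸드 (fast food)              105
19667  20240319  10411                패스트푸드 (fast food)              49
27556  20240312  10205                극장식당 (theater restaurant)       4
15093  20240323  10114                복어취급 (pufferfish handling)      11
20448  20240318  10301                단란주점 (karaoke bar)              120
27639  20240311  20199                숙박업 기타 (lodging, other)        2
3803   20240402  10210                간이주점 (small pub)               2
5955   20240331  10112                호프(소주방) (hof / soju bar)       340
19807  20240319  10201                카바레 (cabaret)                   2
17211  20240321  10112                호프(소주방) (hof / soju bar)       340

Index  처리일자   청소년유해업소업종코드  청소년유해업소업종명                 건수
4053   20240402  20101                관광호텔 (tourist hotel)            21
2775   20240403  10114                복어취급 (pufferfish handling)      2
29574  20240310  10412                커피숍 (coffee shop)               460
19303  20240319  10114                복어취급 (pufferfish handling)      11
23927  20240315  10199                기타 (other)                       634
18901  20240319  10411                패스트푸드 (fast food)              38
21222  20240317  10412                커피숍 (coffee shop)               539
8224   20240329  20101                관광호텔 (tourist hotel)            11
7259   20240330  10410                편의점 (convenience store)         275
14792  20240323  24205                노래연습장업 (singing room business)  172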

Duplicate rows

Most frequently occurring

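Index  처리일자   청소년유해업소업종코드  청소년유해업소업종명                         건수  # duplicates
107    20240309  24201                비디오물감상실업 (video viewing room business)  1    7
145    20240311  10201                카바레 (cabaret)                            2    7
329    20240317  24201                비디오물감상실업 (video viewing room business)  1    6
643    20240328  20399                이용업 기타 (barbershop, other)              1    6
49     20240307  10413                전통찻집 (traditional tea house)             1    5
280    20240316  10202                고고(디스코)클럽 (go-go (disco) club)        1    5
300    20240316  24201                비디오물감상실업 (video viewing room business)  1    5
383    20240319  20399                이용업 기타 (barbershop, other)              1    5
434    20240321  10413                전통찻집 (traditional tea house)             1    5
465    20240322  10408                철도역구내 (railway station premises)        1    5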
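
A table like this can be approximated by counting identical rows. The sketch below uses only the Python standard library on a small hypothetical sample; it is not the routine ydata-profiling uses, and the report does not state which of the three counts below corresponds to its "Duplicate rows" figure. The helper name `duplicate_groups` is illustrative.

```python
from collections import Counter

# Hypothetical mini-sample; each tuple is
# (처리일자, 청소년유해업소업종코드, 청소년유해업소업종명, 건수).
rows = [
    (20240309, 24201, "비디오물감상실업", 1),
    (20240309, 24201, "비디오물감상실업", 1),
    (20240309, 24201, "비디오물감상실업", 1),
    (20240311, 10201, "카바레", 2),
    (20240311, 10201, "카바레", 2),
    (20240402, 10210, "간이주점", 2),
]


def duplicate_groups(records):
    """Return {row: occurrences} for rows that occur more than once."""
    counts = Counter(records)
    return {row: n for row, n in counts.items() if n > 1}


groups = duplicate_groups(rows)

n_groups = len(groups)                         # distinct rows that repeat
n_rows_in_groups = sum(groups.values())        # all rows in those groups
n_extra_copies = n_rows_in_groups - n_groups   # copies beyond the first

# List the groups from most to least frequent, like the table above.
for row, n in sorted(groups.items(), key=lambda item: -item[1]):
    print(row, n)
print(n_groups, n_rows_in_groups, n_extra_copies)
```

Which of these counts to report depends on whether the first occurrence of a row is itself treated as a duplicate.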