Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 931
Duplicate rows (%): 9.3%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B
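
The derived figures in the dataset statistics follow from the raw counts. As a quick check, the minimal sketch below recomputes the duplicate-row percentage and the average record size. The variable names are illustrative, and it assumes that 1 KiB means 1024 bytes.

```python
# Values copied from the dataset statistics above.
n_observations = 10_000
n_duplicate_rows = 931
total_size_kib = 419.9

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = n_duplicate_rows / n_observations * 100

# Average bytes per record, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values shown in the report (9.3% and 43.0 B).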

Variable types

Numeric: 3
Text: 1

Dataset

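Description: 처리일자 (processing date), 청소년유해업소업종코드 (youth-harmful business type code), 청소년유해업소업종명 (youth-harmful business type name), 건수 (count)
Author: 광진구 (Gwangjin-gu)
URL: https://data.seoul.go.kr/dataList/OA-9885/S/1/datasetView.do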

Alerts

Dataset has 931 (9.3%) duplicate rows [Duplicates]

Reproduction

Analysis started: 2024-04-06 11:05:59.183971
Analysis finished: 2024-04-06 11:06:02.041486
Duration: 2.86 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
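
The reported duration is the difference between the two timestamps. A minimal sketch using only the Python standard library is shown here. The format string is an assumption inferred from how the timestamps are printed.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section above.
FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the printed timestamps
started = datetime.strptime("2024-04-06 11:05:59.183971", FMT)
finished = datetime.strptime("2024-04-06 11:06:02.041486", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```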

Variables

처리일자 (processing date)
Real number (ℝ)

Distinct: 31
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240332
Minimum: 20240306
Maximum: 20240405
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20240306
5-th percentile: 20240307
Q1: 20240313
Median: 20240321
Q3: 20240329
95-th percentile: 20240404
Maximum: 20240405
Range: 99
Interquartile range (IQR): 16
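
The column 처리일자 is profiled as a real number, but its values look like dates encoded as YYYYMMDD integers. That reading is an assumption, not something the report states. Under it, the reported range of 99 is a difference between encoded integers rather than a number of days, and the reported mean of 20240332 does not correspond to a valid calendar date. The sketch below uses a hypothetical helper, `parse_yyyymmdd`, to contrast the two.

```python
from datetime import datetime

def parse_yyyymmdd(value: int) -> datetime:
    """Interpret an integer such as 20240306 as a date (assumed YYYYMMDD)."""
    return datetime.strptime(str(value), "%Y%m%d")

minimum, maximum = 20240306, 20240405  # from the quantile statistics above

# Difference of the encoded integers: what the report calls "Range".
numeric_range = maximum - minimum

# Elapsed calendar days between the two dates.
calendar_days = (parse_yyyymmdd(maximum) - parse_yyyymmdd(minimum)).days

print(numeric_range)
print(calendar_days)
```

Under this assumption the span is 30 days, that is 31 calendar dates inclusive, which agrees with the 31 distinct values reported for the column.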

Descriptive statistics

Standard deviation: 31.869133
Coefficient of variation (CV): 1.5745361 × 10⁻⁶
Kurtosis: 1.0190026
Mean: 20240332
Median Absolute Deviation (MAD): 8
Skewness: 1.6456606
Sum: 2.0240332 × 10¹¹
Variance: 1015.6416
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)

Common values
Value | Count | Frequency (%)
20240319 | 352 | 3.5%
20240325 | 346 | 3.5%
20240321 | 345 | 3.5%
20240329 | 341 | 3.4%
20240402 | 339 | 3.4%
20240306 | 337 | 3.4%
20240318 | 335 | 3.4%
20240310 | 334 | 3.3%
20240328 | 334 | 3.3%
20240403 | 332 | 3.3%
Other values (21) | 6605 | 66.0%

Minimum 10 values
Value | Count | Frequency (%)
20240306 | 337 | 3.4%
20240307 | 302 | 3.0%
20240308 | 307 | 3.1%
20240309 | 298 | 3.0%
20240310 | 334 | 3.3%
20240311 | 311 | 3.1%
20240312 | 325 | 3.2%
20240313 | 325 | 3.2%
20240314 | 285 | 2.9%
20240315 | 301 | 3.0%

Maximum 10 values
Value | Count | Frequency (%)
20240405 | 304 | 3.0%
20240404 | 328 | 3.3%
20240403 | 332 | 3.3%
20240402 | 339 | 3.4%
20240401 | 322 | 3.2%
20240331 | 323 | 3.2%
20240330 | 321 | 3.2%
20240329 | 341 | 3.4%
20240328 | 334 | 3.3%
20240327 | 328 | 3.3%

청소년유해업소업종코드 (youth-harmful business type code)
Real number (ℝ)

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12844.941
Minimum: 10101
Maximum: 30111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10101
5-th percentile: 10103
Q1: 10113
Median: 10210
Q3: 10413
95-th percentile: 24201
Maximum: 30111
Range: 20010
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 5159.7114
Coefficient of variation (CV): 0.40169211
Kurtosis: 1.3860038
Mean: 12844.941
Median Absolute Deviation (MAD): 107
Skewness: 1.6672422
Sum: 1.2844941 × 10⁸
Variance: 26622622
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
10410 | 251 | 2.5%
10301 | 249 | 2.5%
10102 | 240 | 2.4%
10412 | 238 | 2.4%
10101 | 237 | 2.4%
10401 | 237 | 2.4%
10403 | 234 | 2.3%
10111 | 233 | 2.3%
10199 | 233 | 2.3%
24205 | 230 | 2.3%
Other values (49) | 7618 | 76.2%

Minimum 10 values
Value | Count | Frequency (%)
10101 | 237 | 2.4%
10102 | 240 | 2.4%
10103 | 218 | 2.2%
10104 | 221 | 2.2%
10105 | 224 | 2.2%
10106 | 209 | 2.1%
10107 | 189 | 1.9%
10108 | 130 | 1.3%
10109 | 16 | 0.2%
10110 | 186 | 1.9%

Maximum 10 values
Value | Count | Frequency (%)
30111 | 101 | 1.0%
30110 | 68 | 0.7%
24205 | 230 | 2.3%
24201 | 193 | 1.9%
24113 | 228 | 2.3%
24101 | 80 | 0.8%
20399 | 149 | 1.5%
20301 | 218 | 2.2%
20199 | 87 | 0.9%
20107 | 195 | 1.9%

청소년유해업소업종명 (youth-harmful business type name)
Text

Distinct: 54
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 4.2146
Min length: 2

Characters and Unicode

Total characters: 42146
Distinct characters: 128
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
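
As an illustration of such character properties, the sketch below counts the Unicode general categories of one value from this text column using Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and the sketch is not meant to reflect how the profiling tool computes its own tables.

```python
import unicodedata
from collections import Counter

def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. 'Lo', 'Ps')."""
    return Counter(unicodedata.category(ch) for ch in text)

# One value from the text column profiled in this report.
sample = "정종,대포집(선술집)"
counts = category_counts(sample)

# Lo = Other Letter, Po = Other Punctuation,
# Ps = Open Punctuation, Pe = Close Punctuation, Zs = Space Separator
for code, n in counts.most_common():
    print(code, n)
```

Hangul syllables fall in the category Other Letter (`Lo`), while the comma and the parentheses fall in the punctuation categories.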

Unique

Unique: 0
Unique (%): 0.0%

Sample

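1st row: 정종,대포집(선술집) [rice wine bar, tavern (pub)]
2nd row: 숙박업 기타 [lodging, other]
3rd row: 일반조리판매 [general cooked-food sales]
4th row: 일식 [Japanese food]
5th row: 편의점 [convenience store]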
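
Value | Count | Frequency (%)
기타 [other] | 866 | 8.5%
패스트푸드 [fast food] | 461 | 4.5%
전통찻집 [traditional tea house] | 315 | 3.1%
관광호텔 [tourist hotel] | 254 | 2.5%
편의점 [convenience store] | 251 | 2.5%
단란주점 [karaoke bar] | 249 | 2.4%
중국식 [Chinese food] | 240 | 2.3%
커피숍 [coffee shop] | 238 | 2.3%
한식 [Korean food] | 237 | 2.3%
과자점 [confectionery shop] | 237 | 2.3%
Other values (44) | 6888 | 67.3%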

Most occurring characters

Value | Count | Frequency (%)
 | 1777 | 4.2%
 | 1595 | 3.8%
) | 1190 | 2.8%
( | 1190 | 2.8%
 | 1107 | 2.6%
 | 1081 | 2.6%
 | 866 | 2.1%
 | 866 | 2.1%
 | 860 | 2.0%
 | 779 | 1.8%
Other values (118) | 30835 | 73.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 39341 | 93.3%
Close Punctuation | 1190 | 2.8%
Open Punctuation | 1190 | 2.8%
Space Separator | 236 | 0.6%
Other Punctuation | 189 | 0.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 1777 | 4.5%
 | 1595 | 4.1%
 | 1107 | 2.8%
 | 1081 | 2.7%
 | 866 | 2.2%
 | 866 | 2.2%
 | 860 | 2.2%
 | 779 | 2.0%
 | 727 | 1.8%
 | 693 | 1.8%
Other values (114) | 28990 | 73.7%

Close Punctuation
Value | Count | Frequency (%)
) | 1190 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1190 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 236 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 189 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 39341 | 93.3%
Common | 2805 | 6.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 1777 | 4.5%
 | 1595 | 4.1%
 | 1107 | 2.8%
 | 1081 | 2.7%
 | 866 | 2.2%
 | 866 | 2.2%
 | 860 | 2.2%
 | 779 | 2.0%
 | 727 | 1.8%
 | 693 | 1.8%
Other values (114) | 28990 | 73.7%

Common
Value | Count | Frequency (%)
) | 1190 | 42.4%
( | 1190 | 42.4%
(space) | 236 | 8.4%
, | 189 | 6.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 39341 | 93.3%
ASCII | 2805 | 6.7%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 1777 | 4.5%
 | 1595 | 4.1%
 | 1107 | 2.8%
 | 1081 | 2.7%
 | 866 | 2.2%
 | 866 | 2.2%
 | 860 | 2.2%
 | 779 | 2.0%
 | 727 | 1.8%
 | 693 | 1.8%
Other values (114) | 28990 | 73.7%

ASCII
Value | Count | Frequency (%)
) | 1190 | 42.4%
( | 1190 | 42.4%
(space) | 236 | 8.4%
, | 189 | 6.7%

건수 (count)
Real number (ℝ)

Distinct: 728
Distinct (%): 7.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 160.4296
Minimum: 1
Maximum: 4914
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 24
Q3: 141
95-th percentile: 629
Maximum: 4914
Range: 4913
Interquartile range (IQR): 137

Descriptive statistics

Standard deviation: 394.95443
Coefficient of variation (CV): 2.4618551
Kurtosis: 36.351411
Mean: 160.4296
Median Absolute Deviation (MAD): 23
Skewness: 5.3104055
Sum: 1604296
Variance: 155989
Monotonicity: Not monotonic
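
Several of these descriptive statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. The sketch below checks both against the figures reported for 건수, using illustrative variable names. It also prints the ratio of mean to median, which is large here and consistent with the strongly positive skewness.

```python
import math

# Values copied from the statistics reported for 건수.
mean = 160.4296
std_dev = 394.95443
variance = 155989
median = 24

# Coefficient of variation: standard deviation relative to the mean.
cv = std_dev / mean

# The variance is the square of the standard deviation.
assert math.isclose(std_dev ** 2, variance, rel_tol=1e-6)

print(f"CV = {cv:.4f}")
print(f"mean / median = {mean / median:.1f}")
```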
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 1102 | 11.0%
2 | 784 | 7.8%
4 | 479 | 4.8%
3 | 389 | 3.9%
5 | 309 | 3.1%
8 | 203 | 2.0%
7 | 201 | 2.0%
6 | 183 | 1.8%
9 | 157 | 1.6%
11 | 136 | 1.4%
Other values (718) | 6057 | 60.6%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1102 | 11.0%
2 | 784 | 7.8%
3 | 389 | 3.9%
4 | 479 | 4.8%
5 | 309 | 3.1%
6 | 183 | 1.8%
7 | 201 | 2.0%
8 | 203 | 2.0%
9 | 157 | 1.6%
10 | 85 | 0.9%

Maximum 10 values
Value | Count | Frequency (%)
4914 | 1 | < 0.1%
4913 | 1 | < 0.1%
4912 | 1 | < 0.1%
4907 | 1 | < 0.1%
4903 | 1 | < 0.1%
4902 | 1 | < 0.1%
4898 | 1 | < 0.1%
3229 | 1 | < 0.1%
3228 | 4 | < 0.1%
3226 | 1 | < 0.1%

Interactions

[Figure: 3 × 3 grid of pairwise interaction plots for the numeric variables]

Correlations

 | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
처리일자 | 1.000 | 0.027 | 0.000 | 0.000
청소년유해업소업종코드 | 0.027 | 1.000 | 0.999 | 0.136
청소년유해업소업종명 | 0.000 | 0.999 | 1.000 | 0.734
건수 | 0.000 | 0.136 | 0.734 | 1.000

 | 처리일자 | 청소년유해업소업종코드 | 건수
처리일자 | 1.000 | -0.013 | 0.006
청소년유해업소업종코드 | -0.013 | 1.000 | -0.258
건수 | 0.006 | -0.258 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
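
The same information can be summarized numerically by counting non-missing cells per column. The sketch below is a plain-Python illustration, not the profiling tool's own method. It uses a hypothetical helper, `non_null_counts`, on a few rows taken from this dataset's sample, with `None` standing for a missing cell.

```python
# Column names as they appear in this report.
COLUMNS = ["처리일자", "청소년유해업소업종코드", "청소년유해업소업종명", "건수"]

# A few in-memory rows for illustration; None would mark a missing cell.
rows = [
    (20240318, 10107, "정종,대포집(선술집)", 61),
    (20240314, 20199, "숙박업 기타", 1),
    (20240315, 10403, "일반조리판매", 127),
]

def non_null_counts(rows, columns):
    """Return {column: number of non-missing cells}."""
    return {
        col: sum(1 for row in rows if row[i] is not None)
        for i, col in enumerate(columns)
    }

counts = non_null_counts(rows, COLUMNS)
missing_cells = len(rows) * len(COLUMNS) - sum(counts.values())
print(counts)
print("Missing cells:", missing_cells)
```

For a dataset with no missing cells, as reported here, every column's count equals the number of rows.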

Sample

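First rows
(index) | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
13836 | 20240318 | 10107 | 정종,대포집(선술집) [rice wine bar, tavern (pub)] | 61
9232 | 20240314 | 20199 | 숙박업 기타 [lodging, other] | 1
10861 | 20240315 | 10403 | 일반조리판매 [general cooked-food sales] | 127
33708 | 20240405 | 10104 | 일식 [Japanese food] | 292
21444 | 20240325 | 10410 | 편의점 [convenience store] | 237
1811 | 20240307 | 10113 | 통닭(치킨) [whole chicken (fried chicken)] | 81
427 | 20240306 | 10413 | 전통찻집 [traditional tea house] | 10
25595 | 20240329 | 10115 | 김밥(도시락) [gimbap (lunch box)] | 6
6544 | 20240311 | 10102 | 중국식 [Chinese food] | 336
26695 | 20240330 | 10104 | 일식 [Japanese food] | 101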
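
Last rows
(index) | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
6324 | 20240311 | 10210 | 간이주점 [casual bar] | 3
20747 | 20240324 | 10411 | 패스트푸드 [fast food] | 77
6845 | 20240312 | 10411 | 패스트푸드 [fast food] | 50
19512 | 20240323 | 10105 | 분식 [snack food] | 227
28438 | 20240331 | 10206 | 스텐드바 [stand bar] | 1
10332 | 20240315 | 10301 | 단란주점 [karaoke bar] | 31
16843 | 20240321 | 10104 | 일식 [Japanese food] | 147
25777 | 20240329 | 10104 | 일식 [Japanese food] | 113
30894 | 20240402 | 10101 | 한식 [Korean food] | 3223
9665 | 20240314 | 10403 | 일반조리판매 [general cooked-food sales] | 180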

Duplicate rows

Most frequently occurring

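(index) | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수 | # duplicates
116 | 20240309 | 24201 | 비디오물감상실업 [video viewing room] | 1 | 6
503 | 20240323 | 10201 | 카바레 [cabaret] | 2 | 6
137 | 20240310 | 10413 | 전통찻집 [traditional tea house] | 1 | 5
194 | 20240312 | 10408 | 철도역구내 [railway station premises] | 1 | 5
231 | 20240313 | 20399 | 이용업 기타 [barbershop, other] | 1 | 5
234 | 20240313 | 24201 | 비디오물감상실업 [video viewing room] | 1 | 5
322 | 20240317 | 10201 | 카바레 [cabaret] | 2 | 5
429 | 20240320 | 24201 | 비디오물감상실업 [video viewing room] | 1 | 5
529 | 20240324 | 10202 | 고고(디스코)클럽 [go-go (disco) club] | 1 | 5
536 | 20240324 | 10409 | 백화점 [department store] | 1 | 5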
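
A table of this kind can be approximated by counting how often each complete row occurs. The sketch below uses `collections.Counter` on a handful of made-up rows in the same column layout. How the profiling tool itself defines and counts duplicates is not shown in this report, so this is only a plain-Python analogue.

```python
from collections import Counter

# Hypothetical rows: (처리일자, 청소년유해업소업종코드, 청소년유해업소업종명, 건수)
rows = [
    (20240309, 24201, "비디오물감상실업", 1),
    (20240309, 24201, "비디오물감상실업", 1),
    (20240323, 10201, "카바레", 2),
    (20240310, 10413, "전통찻집", 1),
    (20240323, 10201, "카바레", 2),
    (20240309, 24201, "비디오물감상실업", 1),
]

# Count occurrences of each complete row (tuples are hashable).
occurrences = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicated = [(row, n) for row, n in occurrences.most_common() if n > 1]

for row, n in duplicated:
    print(row, n)
```

Rows that occur only once, such as the 전통찻집 row in this toy input, are left out of the result.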