Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 911
Duplicate rows (%): 9.1%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B
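
The percentage and per-record figures in this block follow from the raw counts. A minimal sketch of that arithmetic, assuming 1 KiB is 1024 bytes and that the report rounds to one decimal place (the variable names are illustrative and do not come from the report):

```python
# Values copied from the "Dataset statistics" block.
n_rows = 10_000
n_duplicate_rows = 911
total_size_kib = 419.9

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * n_duplicate_rows / n_rows

# Average record size in bytes, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"{duplicate_pct:.1f}%")
print(f"{avg_record_bytes:.1f} B")
```

Both results agree with the reported 9.1% and 43.0 B once rounded.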

Variable types

Numeric: 3
Text: 1

Dataset

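Description: 처리일자 (processing date), 청소년유해업소업종코드 (youth-harmful business type code), 청소년유해업소업종명 (youth-harmful business type name), 건수 (count)
Author: 송파구 (Songpa-gu)
URL: https://data.seoul.go.kr/dataList/OA-11425/S/1/datasetView.do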

Alerts

Duplicates: Dataset has 911 (9.1%) duplicate rows

Reproduction

Analysis started: 2024-05-11 15:49:34.597929
Analysis finished: 2024-05-11 15:49:40.117553
Duration: 5.52 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

처리일자 (processing date)
Real number (ℝ)

Distinct: 30
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240448
Minimum: 20240411
Maximum: 20240510
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20240411
5-th percentile: 20240412
Q1: 20240418
Median: 20240425
Q3: 20240503
95-th percentile: 20240509
Maximum: 20240510
Range: 99
Interquartile range (IQR): 85
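
The values of 처리일자 look like dates stored as integers in YYYYMMDD form (an inference from the values; the report does not state this). If so, the Range and IQR reported here are differences between integers, not day counts. A small sketch that converts the reported quantiles to calendar dates, where `parse_yyyymmdd` is a hypothetical helper introduced only for illustration:

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Interpret an integer such as 20240411 as the date 2024-04-11."""
    return datetime.strptime(str(value), "%Y%m%d").date()


# Quantiles copied from the report.
minimum, q1, q3, maximum = 20240411, 20240418, 20240503, 20240510

# Differences between the raw integers, as the report computes them.
integer_range = maximum - minimum
integer_iqr = q3 - q1

# Differences in calendar days, under the YYYYMMDD assumption.
calendar_range = (parse_yyyymmdd(maximum) - parse_yyyymmdd(minimum)).days
calendar_iqr = (parse_yyyymmdd(q3) - parse_yyyymmdd(q1)).days

print(integer_range, calendar_range)  # 99 29
print(integer_iqr, calendar_iqr)      # 85 15
```

A 29-day span is consistent with the 30 distinct values reported for this variable.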

Descriptive statistics

Standard deviation: 40.143938
Coefficient of variation (CV): 1.9833522 × 10^-6
Kurtosis: -1.4291843
Mean: 20240448
Median Absolute Deviation (MAD): 9
Skewness: 0.70865617
Sum: 2.0240448 × 10^11
Variance: 1611.5358
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)

Common values

Value  Count  Frequency (%)
20240422  383  3.8%
20240425  366  3.7%
20240430  358  3.6%
20240423  355  3.5%
20240415  352  3.5%
20240508  350  3.5%
20240427  346  3.5%
20240510  341  3.4%
20240419  339  3.4%
20240418  337  3.4%
Other values (20)  6473  64.7%

Minimum 10 values

Value  Count  Frequency (%)
20240411  330  3.3%
20240412  337  3.4%
20240413  330  3.3%
20240414  327  3.3%
20240415  352  3.5%
20240416  332  3.3%
20240417  331  3.3%
20240418  337  3.4%
20240419  339  3.4%
20240420  313  3.1%

Maximum 10 values

Value  Count  Frequency (%)
20240510  341  3.4%
20240509  333  3.3%
20240508  350  3.5%
20240507  317  3.2%
20240506  325  3.2%
20240505  303  3.0%
20240504  328  3.3%
20240503  317  3.2%
20240502  324  3.2%
20240501  312  3.1%

청소년유해업소업종코드 (youth-harmful business type code)
Real number (ℝ)

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12921.426
Minimum: 10101
Maximum: 30111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10101
5-th percentile: 10103
Q1: 10113
Median: 10210
Q3: 10413
95-th percentile: 24201
Maximum: 30111
Range: 20010
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 5218.5578
Coefficient of variation (CV): 0.40386856
Kurtosis: 1.3178832
Mean: 12921.426
Median Absolute Deviation (MAD): 107
Skewness: 1.6360906
Sum: 1.2921426 × 10^8
Variance: 27233346
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
20105  244  2.4%
10104  238  2.4%
20301  237  2.4%
10301  237  2.4%
10412  236  2.4%
10411  235  2.4%
10112  234  2.3%
10116  233  2.3%
10208  231  2.3%
10102  230  2.3%
Other values (49)  7645  76.4%

Minimum 10 values

Value  Count  Frequency (%)
10101  211  2.1%
10102  230  2.3%
10103  226  2.3%
10104  238  2.4%
10105  220  2.2%
10106  223  2.2%
10107  216  2.2%
10108  140  1.4%
10109  20  0.2%
10110  202  2.0%

Maximum 10 values

Value  Count  Frequency (%)
30111  128  1.3%
30110  64  0.6%
24205  218  2.2%
24201  186  1.9%
24113  213  2.1%
24101  75  0.8%
20399  160  1.6%
20301  237  2.4%
20199  96  1.0%
20107  206  2.1%

청소년유해업소업종명 (youth-harmful business type name)
Text

Distinct: 54
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 4.2701
Min length: 2

Characters and Unicode

Total characters: 42701
Distinct characters: 128
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
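
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories over a few values taken from this report; `category_counts` is a hypothetical helper and is not necessarily how the profiler computes its tables. The two-letter codes map to the category names used in this report: `Lo` is Other Letter, `Ps` Open Punctuation, `Pe` Close Punctuation, `Zs` Space Separator, and `Po` Other Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category across all strings."""
    counts = Counter()
    for text in values:
        counts.update(unicodedata.category(ch) for ch in text)
    return counts


# A few values that appear in this report's sample and duplicate tables.
sample = ["통닭(치킨)", "이용업 기타", "일식"]
counts = category_counts(sample)

for category, n in counts.most_common():
    print(category, n)
```

Hangul syllables fall under `Lo`, which is why Other Letter dominates the category table for this variable.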

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 무도학원업 (dance academy)
2nd row: 식육취급 (meat handling)
3rd row: 일식 (Japanese food)
4th row: 통닭(치킨) (fried chicken)
5th row: 출장조리 (off-site catering)
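
Value  Count  Frequency (%)
기타 (other)  823  8.0%
패스트푸드 (fast food)  442  4.3%
전통찻집 (traditional teahouse)  342  3.3%
관광호텔 (tourist hotel)  271  2.6%
여관업 (inn)  244  2.4%
일식 (Japanese food)  238  2.3%
단란주점 (karaoke bar)  237  2.3%
일반이용업 (general barbershop)  237  2.3%
커피숍 (coffee shop)  236  2.3%
호프(소주방 (pub / soju bar)  234  2.3%
Other values (44)  6952  67.8%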

Most occurring characters

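Value  Count  Frequency (%)
[character lost]  1827  4.3%
[character lost]  1577  3.7%
(  1247  2.9%
)  1247  2.9%
[character lost]  1098  2.6%
[character lost]  988  2.3%
[character lost]  860  2.0%
[character lost]  823  1.9%
[character lost]  823  1.9%
[character lost]  820  1.9%
Other values (118)  31391  73.5%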

Most occurring categories

Value  Count  Frequency (%)
Other Letter  39735  93.1%
Open Punctuation  1247  2.9%
Close Punctuation  1247  2.9%
Space Separator  256  0.6%
Other Punctuation  216  0.5%

Most frequent character per category

Other Letter

Value  Count  Frequency (%)
[character lost]  1827  4.6%
[character lost]  1577  4.0%
[character lost]  1098  2.8%
[character lost]  988  2.5%
[character lost]  860  2.2%
[character lost]  823  2.1%
[character lost]  823  2.1%
[character lost]  820  2.1%
[character lost]  782  2.0%
[character lost]  774  1.9%
Other values (114)  29363  73.9%

Open Punctuation

Value  Count  Frequency (%)
(  1247  100.0%

Close Punctuation

Value  Count  Frequency (%)
)  1247  100.0%

Space Separator

Value  Count  Frequency (%)
(space)  256  100.0%

Other Punctuation

Value  Count  Frequency (%)
,  216  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  39735  93.1%
Common  2966  6.9%

Most frequent character per script

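Hangul

Value  Count  Frequency (%)
[character lost]  1827  4.6%
[character lost]  1577  4.0%
[character lost]  1098  2.8%
[character lost]  988  2.5%
[character lost]  860  2.2%
[character lost]  823  2.1%
[character lost]  823  2.1%
[character lost]  820  2.1%
[character lost]  782  2.0%
[character lost]  774  1.9%
Other values (114)  29363  73.9%

Common

Value  Count  Frequency (%)
(  1247  42.0%
)  1247  42.0%
(space)  256  8.6%
,  216  7.3%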

Most occurring blocks

Value  Count  Frequency (%)
Hangul  39735  93.1%
ASCII  2966  6.9%

Most frequent character per block

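Hangul

Value  Count  Frequency (%)
[character lost]  1827  4.6%
[character lost]  1577  4.0%
[character lost]  1098  2.8%
[character lost]  988  2.5%
[character lost]  860  2.2%
[character lost]  823  2.1%
[character lost]  823  2.1%
[character lost]  820  2.1%
[character lost]  782  2.0%
[character lost]  774  1.9%
Other values (114)  29363  73.9%

ASCII

Value  Count  Frequency (%)
(  1247  42.0%
)  1247  42.0%
(space)  256  8.6%
,  216  7.3%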

건수 (count)
Real number (ℝ)

Distinct: 719
Distinct (%): 7.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 153.0759
Minimum: 1
Maximum: 4926
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 23
Q3: 131
95-th percentile: 585
Maximum: 4926
Range: 4925
Interquartile range (IQR): 127

Descriptive statistics

Standard deviation: 378.44631
Coefficient of variation (CV): 2.4722788
Kurtosis: 37.132872
Mean: 153.0759
Median Absolute Deviation (MAD): 22
Skewness: 5.3782384
Sum: 1530759
Variance: 143221.61
Monotonicity: Not monotonic
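
Several of these descriptive statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. A short sketch that recomputes them from the reported values for 건수 (the variable names are illustrative, and small differences in the last digit come from rounding in the report):

```python
# Reported values for 건수.
n = 10_000
mean = 153.0759
std = 378.44631

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance as squared standard deviation
total = mean * n       # sum of all values

print(round(cv, 4))
print(round(variance, 1))
print(round(total))
```

The recomputed figures agree with the reported CV of 2.4722788, variance of 143221.61, and sum of 1530759 to within rounding.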
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
1  1152  11.5%
2  724  7.2%
4  493  4.9%
3  403  4.0%
5  333  3.3%
7  204  2.0%
8  202  2.0%
6  177  1.8%
11  146  1.5%
15  130  1.3%
Other values (709)  6036  60.4%

Minimum 10 values

Value  Count  Frequency (%)
1  1152  11.5%
2  724  7.2%
3  403  4.0%
4  493  4.9%
5  333  3.3%
6  177  1.8%
7  204  2.0%
8  202  2.0%
9  112  1.1%
10  79  0.8%

Maximum 10 values

Value  Count  Frequency (%)
4926  2  < 0.1%
4924  1  < 0.1%
4923  1  < 0.1%
4922  1  < 0.1%
3228  1  < 0.1%
3227  1  < 0.1%
3225  1  < 0.1%
3221  3  < 0.1%
3220  2  < 0.1%
3219  2  < 0.1%

Interactions

[Figures: nine interaction plots; images not included]

Correlations

Variable  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
처리일자  1.000  0.000  0.000  0.000
청소년유해업소업종코드  0.000  1.000  0.999  0.132
청소년유해업소업종명  0.000  0.999  1.000  0.741
건수  0.000  0.132  0.741  1.000
Variable  처리일자  청소년유해업소업종코드  건수
처리일자  1.000  0.013  -0.009
청소년유해업소업종코드  0.013  1.000  -0.268
건수  -0.009  -0.268  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
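
The quantity behind both nullity plots is the number of missing cells per column. A minimal sketch of that count, assuming rows are held as dictionaries and a missing cell is represented by `None` (both are illustrative choices; `missing_counts` is a hypothetical helper, and the two rows are taken from the Sample section):

```python
def missing_counts(rows, columns):
    """Return the number of missing (None or absent) cells per column."""
    counts = {name: 0 for name in columns}
    for row in rows:
        for name in columns:
            if row.get(name) is None:
                counts[name] += 1
    return counts


columns = ["처리일자", "청소년유해업소업종코드", "청소년유해업소업종명", "건수"]
rows = [
    {
        "처리일자": 20240428,
        "청소년유해업소업종코드": 30111,
        "청소년유해업소업종명": "무도학원업",
        "건수": 3,
    },
    {
        "처리일자": 20240509,
        "청소년유해업소업종코드": 10118,
        "청소년유해업소업종명": "식육취급",
        "건수": 27,
    },
]

print(missing_counts(rows, columns))
```

For this dataset every column would report zero, matching the "Missing cells: 0" figure in the overview.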

Sample

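Index  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
13473  20240428  30111  무도학원업 (dance academy)  3
1895  20240509  10118  식육취급 (meat handling)  27
13215  20240428  10104  일식 (Japanese food)  604
6843  20240504  10113  통닭(치킨) (fried chicken)  118
29534  20240414  10110  출장조리 (off-site catering)  2
13127  20240429  10101  한식 (Korean food)  1532
13643  20240428  10116  생선회 (sliced raw fish)  3
30257  20240413  10102  중국식 (Chinese food)  408
26987  20240416  20301  일반이용업 (general barbershop)  68
15769  20240426  10111  패스트푸드 (fast food)  19

Index  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
23989  20240419  10111  패스트푸드 (fast food)  12
15710  20240426  10410  편의점 (convenience store)  226
3177  20240508  10112  호프(소주방) (pub / soju bar)  290
4669  20240506  10117  까페 (cafe)  35
17780  20240424  10410  편의점 (convenience store)  196
767  20240510  10402  다방 (tearoom)  122
19160  20240423  10412  커피숍 (coffee shop)  517
16520  20240425  24113  일반게임제공업 (general game arcade)  4
17212  20240425  30111  무도학원업 (dance academy)  3
24440  20240418  10103  경양식 (Western-style light meals)  2208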

Duplicate rows

Most frequently occurring

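Index  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수  # duplicates
233  20240418  24201  비디오물감상실업 (video viewing room)  1  6
712  20240504  20399  이용업 기타 (barbershop, other)  1  6
878  20240509  24201  비디오물감상실업 (video viewing room)  1  6
56  20240413  10201  카바레 (cabaret)  2  5
72  20240413  20399  이용업 기타 (barbershop, other)  1  5
99  20240414  10413  전통찻집 (traditional teahouse)  1  5
243  20240419  10202  고고(디스코)클럽 (go-go (disco) club)  1  5
286  20240420  20399  이용업 기타 (barbershop, other)  1  5
364  20240423  10201  카바레 (cabaret)  2  5
371  20240423  10408  철도역구내 (railway station premises)  1  5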
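
A table of this kind can be produced by grouping identical rows and keeping the groups that occur more than once. The sketch below shows one way to do that with the standard library. It is not necessarily how the profiler counts duplicates, `duplicate_groups` is a hypothetical helper, and the rows are toy data modelled on the table, not the real dataset.

```python
from collections import Counter


def duplicate_groups(rows):
    """Map each row that occurs more than once to its occurrence count."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Toy rows: (처리일자, 청소년유해업소업종코드, 청소년유해업소업종명, 건수).
rows = [
    (20240418, 24201, "비디오물감상실업", 1),
    (20240418, 24201, "비디오물감상실업", 1),
    (20240413, 10201, "카바레", 2),
    (20240418, 24201, "비디오물감상실업", 1),
    (20240428, 10104, "일식", 604),
]

groups = duplicate_groups(rows)
for row, n in groups.items():
    print(row, n)
```

Whether a headline figure such as "911 duplicate rows" counts distinct duplicated groups or every repeated row depends on the tool's definition, which the report does not state.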