Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 900
Duplicate rows (%): 9.0%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B
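
The two derived figures in this block can be re-derived from the raw ones. The following is a minimal sketch, assuming the percentage is taken over all observations and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block.
n_observations = 10_000
n_duplicate_rows = 900
total_size_kib = 419.9

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = round(100 * n_duplicate_rows / n_observations, 1)

# Average record size in bytes: total memory (KiB -> bytes) divided by row count.
avg_record_bytes = round(total_size_kib * 1024 / n_observations, 1)

print(duplicate_pct, avg_record_bytes)
```

Both values agree with the reported 9.0% and 43.0 B after rounding to one decimal place.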

Variable types

Numeric: 3
Text: 1

Dataset

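Description: 처리일자, 청소년유해업소업종코드, 청소년유해업소업종명, 건수 (processing date, youth-harmful business type code, youth-harmful business type name, count)
Author: 노원구 (Nowon-gu)
URL: https://data.seoul.go.kr/dataList/OA-10963/S/1/datasetView.do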

Alerts

Dataset has 900 (9.0%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2024-05-11 16:07:43.754433
Analysis finished: 2024-05-11 16:07:49.213229
Duration: 5.46 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
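
As a quick consistency check, the reported duration can be recomputed from the two timestamps. This sketch assumes both timestamps are in the same time zone and use the format shown; the names are illustrative.

```python
from datetime import datetime

# Timestamp layout used in the "Reproduction" block.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-05-11 16:07:43.754433", FMT)
finished = datetime.strptime("2024-05-11 16:07:49.213229", FMT)

# Elapsed wall-clock time between the two timestamps, in seconds.
duration_seconds = (finished - started).total_seconds()

print(f"{duration_seconds:.2f} seconds")
```

The difference is 5.458796 seconds, which rounds to the reported 5.46 seconds.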

Variables

처리일자 (processing date)
Real number (ℝ)

Distinct: 30
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240448
Minimum: 20240411
Maximum: 20240510
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20240411
5-th percentile: 20240412
Q1: 20240418
median: 20240425
Q3: 20240503
95-th percentile: 20240509
Maximum: 20240510
Range: 99
Interquartile range (IQR): 85
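
The report treats 처리일자 as a real number, so Range and IQR are plain integer differences. Assuming the values encode calendar dates as YYYYMMDD (an assumption based on their shape, not something the report states), the sketch below contrasts the integer differences with the corresponding differences in days. The helper `parse_yyyymmdd` is hypothetical.

```python
from datetime import datetime

def parse_yyyymmdd(value: int) -> datetime:
    """Interpret an integer such as 20240411 as the date 2024-04-11."""
    return datetime.strptime(str(value), "%Y%m%d")

# Quantiles copied from the block above.
minimum, q1, q3, maximum = 20240411, 20240418, 20240503, 20240510

# Differences as the report computes them (plain integers).
integer_range = maximum - minimum
integer_iqr = q3 - q1

# The same differences measured in calendar days.
calendar_range_days = (parse_yyyymmdd(maximum) - parse_yyyymmdd(minimum)).days
calendar_iqr_days = (parse_yyyymmdd(q3) - parse_yyyymmdd(q1)).days

print(integer_range, integer_iqr, calendar_range_days, calendar_iqr_days)
```

Under that assumption the data span 29 days (30 distinct dates, matching Distinct: 30), and the integer figures 99 and 85 are inflated by the jump from 202404xx to 202405xx.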

Descriptive statistics

Standard deviation: 40.233751
Coefficient of variation (CV): 1.9877895 × 10^-6
Kurtosis: -1.456107
Mean: 20240448
Median Absolute Deviation (MAD): 9
Skewness: 0.68995353
Sum: 2.0240448 × 10^11
Variance: 1618.7548
Monotonicity: Not monotonic
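
Several of these statistics are related by simple identities, which makes the exponents easy to verify. The sketch below uses the displayed (rounded) values, so agreement is only expected to a few significant digits; the variable names are illustrative.

```python
import math

# Values copied from the report (already rounded for display).
n = 10_000
mean = 20240448
std = 40.233751

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation
total = mean * n       # sum is the mean times the number of observations

print(f"{cv:.7e}", round(variance, 4), f"{total:.7e}")

# Compare with the reported figures, allowing for display rounding.
assert math.isclose(cv, 1.9877895e-6, rel_tol=1e-6)
assert math.isclose(variance, 1618.7548, rel_tol=1e-6)
assert math.isclose(total, 2.0240448e11, rel_tol=1e-6)
```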
Histogram with fixed size bins (bins=30)
Common values

Value              Count  Frequency (%)
20240426             366  3.7%
20240502             361  3.6%
20240417             354  3.5%
20240414             353  3.5%
20240425             352  3.5%
20240424             351  3.5%
20240415             349  3.5%
20240418             346  3.5%
20240510             346  3.5%
20240427             341  3.4%
Other values (20)   6481  64.8%
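
A table of this shape (top values plus an aggregated "Other values" row) can be sketched with the standard library. This is an illustration of the layout, not the profiler's own implementation; `common_values` and the sample list are hypothetical.

```python
from collections import Counter

def common_values(values, top_n=10):
    """Return (value, count, percent) rows for the top_n values plus an 'Other' row."""
    counts = Counter(values)
    total = sum(counts.values())
    top = counts.most_common(top_n)
    rows = [(value, count, 100 * count / total) for value, count in top]
    n_other = len(counts) - len(top)
    if n_other > 0:
        other_count = total - sum(count for _, count in top)
        rows.append((f"Other values ({n_other})", other_count, 100 * other_count / total))
    return rows

# Tiny made-up sample, only to show the shape of the result.
sample = [20240426] * 3 + [20240502] * 2 + [20240417, 20240414]
rows = common_values(sample, top_n=2)
for value, count, pct in rows:
    print(value, count, f"{pct:.1f}%")

# Sanity check on the reported table: the top-10 counts plus "Other" cover all rows.
reported_top10 = [366, 361, 354, 353, 352, 351, 349, 346, 346, 341]
reported_total = sum(reported_top10) + 6481
```

The reported counts add up to 10000, the number of observations.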
Minimum 10 values

Value     Count  Frequency (%)
20240411    335  3.4%
20240412    317  3.2%
20240413    319  3.2%
20240414    353  3.5%
20240415    349  3.5%
20240416    320  3.2%
20240417    354  3.5%
20240418    346  3.5%
20240419    321  3.2%
20240420    326  3.3%
Maximum 10 values

Value     Count  Frequency (%)
20240510    346  3.5%
20240509    333  3.3%
20240508    329  3.3%
20240507    326  3.3%
20240506    309  3.1%
20240505    295  2.9%
20240504    326  3.3%
20240503    340  3.4%
20240502    361  3.6%
20240501    325  3.2%

청소년유해업소업종코드 (youth-harmful business type code)
Real number (ℝ)

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12844.655
Minimum: 10101
Maximum: 30111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10101
5-th percentile: 10103
Q1: 10113
median: 10208
Q3: 10413
95-th percentile: 24201
Maximum: 30111
Range: 20010
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 5208.3209
Coefficient of variation (CV): 0.40548546
Kurtosis: 1.5538384
Mean: 12844.655
Median Absolute Deviation (MAD): 105
Skewness: 1.7048286
Sum: 1.2844655 × 10^8
Variance: 27126607
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value              Count  Frequency (%)
10301                254  2.5%
10403                252  2.5%
10103                243  2.4%
10199                243  2.4%
10116                238  2.4%
10208                233  2.3%
10101                232  2.3%
10118                231  2.3%
10111                230  2.3%
10411                230  2.3%
Other values (49)   7614  76.1%
Minimum 10 values

Value  Count  Frequency (%)
10101    232  2.3%
10102    219  2.2%
10103    243  2.4%
10104    211  2.1%
10105    214  2.1%
10106    224  2.2%
10107    217  2.2%
10108    110  1.1%
10109     15  0.1%
10110    199  2.0%
Maximum 10 values

Value  Count  Frequency (%)
30111    139  1.4%
30110     64  0.6%
24205    214  2.1%
24201    182  1.8%
24113    211  2.1%
24101     91  0.9%
20399    134  1.3%
20301    226  2.3%
20199    100  1.0%
20107    204  2.0%

청소년유해업소업종명 (youth-harmful business type name)
Text

Distinct: 54
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 4.233
Min length: 2

Characters and Unicode

Total characters: 42330
Distinct characters: 128
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
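
To illustrate how such properties are read off code points, here is a minimal sketch using Python's `unicodedata` module on the five sample rows listed under "Sample". It assumes the text is in precomposed (NFC) form and that the report's category names correspond to the Unicode general categories `Lo` (Other Letter), `Ps` (Open Punctuation), `Pe` (Close Punctuation), `Zs` (Space Separator) and `Po` (Other Punctuation); it is not the profiler's own code.

```python
import unicodedata
from collections import Counter

# The five sample rows shown in this report for the text variable.
sample_rows = ["일반조리판매", "고고(디스코)클럽", "일반이용업", "카바레", "기타"]

# Count characters by Unicode general category across all sample rows.
category_counts = Counter(
    unicodedata.category(ch) for row in sample_rows for ch in row
)

for category, count in category_counts.most_common():
    print(category, count)

# Categories of the two other symbols that appear in the report's tables.
comma_category = unicodedata.category(",")
space_category = unicodedata.category(" ")
```

On these five rows, every Hangul syllable falls in `Lo`, and the parentheses fall in `Ps` and `Pe`.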

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반조리판매
2nd row: 고고(디스코)클럽
3rd row: 일반이용업
4th row: 카바레
5th row: 기타
Value              Count  Frequency (%)
기타                 885  8.6%
패스트푸드           460  4.5%
전통찻집             283  2.8%
단란주점             254  2.5%
일반조리판매         252  2.5%
경양식               243  2.4%
관광호텔             240  2.3%
생선회               238  2.3%
룸살롱               233  2.3%
한식                 232  2.3%
Other values (44)   6914  67.6%

Most occurring characters

Value               Count  Frequency (%)
(missing)            1781  4.2%
(missing)            1595  3.8%
)                    1205  2.8%
(                    1205  2.8%
(missing)            1066  2.5%
(missing)            1032  2.4%
(missing)             885  2.1%
(missing)             885  2.1%
(missing)             855  2.0%
(missing)             797  1.9%
Other values (118)  31024  73.3%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       39469  93.2%
Close Punctuation   1205  2.8%
Open Punctuation    1205  2.8%
Space Separator      234  0.6%
Other Punctuation    217  0.5%

Most frequent character per category

Other Letter
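Value               Count  Frequency (%)
(missing)            1781  4.5%
(missing)            1595  4.0%
(missing)            1066  2.7%
(missing)            1032  2.6%
(missing)             885  2.2%
(missing)             885  2.2%
(missing)             855  2.2%
(missing)             797  2.0%
(missing)             717  1.8%
(missing)             715  1.8%
Other values (114)  29141  73.8%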
Close Punctuation
Value    Count  Frequency (%)
)         1205  100.0%
Open Punctuation
Value    Count  Frequency (%)
(         1205  100.0%
Space Separator
Value    Count  Frequency (%)
(space)    234  100.0%
Other Punctuation
Value    Count  Frequency (%)
,          217  100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  39469  93.2%
Common   2861  6.8%

Most frequent character per script

Hangul
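Value               Count  Frequency (%)
(missing)            1781  4.5%
(missing)            1595  4.0%
(missing)            1066  2.7%
(missing)            1032  2.6%
(missing)             885  2.2%
(missing)             885  2.2%
(missing)             855  2.2%
(missing)             797  2.0%
(missing)             717  1.8%
(missing)             715  1.8%
Other values (114)  29141  73.8%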
Common
Value    Count  Frequency (%)
)         1205  42.1%
(         1205  42.1%
(space)    234  8.2%
,          217  7.6%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  39469  93.2%
ASCII    2861  6.8%

Most frequent character per block

Hangul
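Value               Count  Frequency (%)
(missing)            1781  4.5%
(missing)            1595  4.0%
(missing)            1066  2.7%
(missing)            1032  2.6%
(missing)             885  2.2%
(missing)             885  2.2%
(missing)             855  2.2%
(missing)             797  2.0%
(missing)             717  1.8%
(missing)             715  1.8%
Other values (114)  29141  73.8%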
ASCII
Value    Count  Frequency (%)
)         1205  42.1%
(         1205  42.1%
(space)    234  8.2%
,          217  7.6%

건수 (count)
Real number (ℝ)

Distinct: 732
Distinct (%): 7.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 158.4547
Minimum: 1
Maximum: 4930
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
median: 24
Q3: 133
95-th percentile: 617.3
Maximum: 4930
Range: 4929
Interquartile range (IQR): 129

Descriptive statistics

Standard deviation: 395.67139
Coefficient of variation (CV): 2.4970631
Kurtosis: 39.292311
Mean: 158.4547
Median Absolute Deviation (MAD): 23
Skewness: 5.4830728
Sum: 1584547
Variance: 156555.85
Monotonicity: Not monotonic
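
The summary figures for 건수 can be cross-checked against each other with a few identities. The sketch below works from the displayed (rounded) values, so comparisons use a small tolerance; the variable names are illustrative.

```python
import math

# Values copied from the 건수 statistics in this report.
n = 10_000
total = 1_584_547
std = 395.67139
minimum, q1, median, q3, maximum = 1, 4, 24, 133, 4930

mean = total / n              # mean is the sum divided by the row count
cv = std / mean               # coefficient of variation
variance = std ** 2           # squared standard deviation
iqr = q3 - q1                 # interquartile range
value_range = maximum - minimum

print(mean, round(cv, 7), round(variance, 2), iqr, value_range)

# Compare with the reported figures, allowing for display rounding.
assert math.isclose(mean, 158.4547)
assert math.isclose(cv, 2.4970631, rel_tol=1e-6)
assert math.isclose(variance, 156555.85, rel_tol=1e-6)
```

The mean (158.4547) is far above the median (24), which is consistent with the strongly positive skewness reported for this variable.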
Histogram with fixed size bins (bins=50)
Common values

Value               Count  Frequency (%)
1                    1110  11.1%
2                     727  7.3%
4                     471  4.7%
3                     391  3.9%
5                     322  3.2%
7                     202  2.0%
6                     183  1.8%
8                     179  1.8%
11                    168  1.7%
9                     152  1.5%
Other values (722)   6095  61.0%
Minimum 10 values

Value  Count  Frequency (%)
1       1110  11.1%
2        727  7.3%
3        391  3.9%
4        471  4.7%
5        322  3.2%
6        183  1.8%
7        202  2.0%
8        179  1.8%
9        152  1.5%
10        87  0.9%
Maximum 10 values

Value  Count  Frequency (%)
4930       1  < 0.1%
4926       4  < 0.1%
4925       2  < 0.1%
4922       1  < 0.1%
4920       1  < 0.1%
3228       2  < 0.1%
3227       2  < 0.1%
3225       1  < 0.1%
3221       2  < 0.1%
3220       1  < 0.1%

Interactions


Correlations

                        처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
처리일자                1.000     0.000                   0.000                 0.000
청소년유해업소업종코드  0.000     1.000                   0.999                 0.135
청소년유해업소업종명    0.000     0.999                   1.000                 0.739
건수                    0.000     0.135                   0.739                 1.000
                        처리일자  청소년유해업소업종코드  건수
처리일자                1.000     0.001                   0.001
청소년유해업소업종코드  0.001     1.000                   -0.255
건수                    0.001     -0.255                  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index)  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
23419    20240502  10403                   일반조리판매          166
7529     20240417  10202                   고고(디스코)클럽      4
31147    20240509  20301                   일반이용업            89
24102    20240502  10201                   카바레                4
23049    20240501  10299                   기타                  2
14440    20240424  10403                   일반조리판매          168
9759     20240419  10117                   까페                  7
27419    20240505  10407                   유원지                1
25426    20240504  10115                   김밥(도시락)          8
16358    20240425  10301                   단란주점              92
(index)  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
25822    20240504  10112                   호프(소주방)          398
22640    20240501  24113                   일반게임제공업        1
787      20240411  10107                   정종,대포집(선술집)   76
22989    20240501  24113                   일반게임제공업        8
10333    20240420  20301                   일반이용업            122
2647     20240413  10110                   출장조리              5
2665     20240413  10403                   일반조리판매          98
14754    20240424  20399                   이용업 기타           1
23248    20240502  20107                   여인숙업              11
2802     20240413  10119                   탕류                  1

Duplicate rows

Most frequently occurring

(index)  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수  # duplicates
812      20240508  10202                   고고(디스코)클럽      1     6
897      20240510  20399                   이용업 기타           1     6
52       20240412  10413                   전통찻집              1     5
154      20240415  24201                   비디오물감상실업      1     5
163      20240416  10201                   카바레                2     5
336      20240422  10119                   탕류                  1     5
364      20240423  10201                   카바레                2     5
406      20240424  10409                   백화점                1     5
455      20240426  10201                   카바레                2     5
566      20240429  24201                   비디오물감상실업      1     5
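
A "# duplicates" column like the one above can be sketched by counting identical rows. The example below uses a small made-up list of rows (not the real dataset) and reports two different tallies: the number of distinct row patterns that repeat, and the number of surplus copies. Which of these the report's figure of 900 corresponds to is not stated in this report, so the sketch makes no claim about it.

```python
from collections import Counter

# Hypothetical rows in the order (처리일자, 업종코드, 업종명, 건수).
rows = [
    (20240508, 10202, "고고(디스코)클럽", 1),
    (20240508, 10202, "고고(디스코)클럽", 1),
    (20240508, 10202, "고고(디스코)클럽", 1),
    (20240416, 10201, "카바레", 2),
    (20240416, 10201, "카바레", 2),
    (20240502, 10403, "일반조리판매", 166),
]

counts = Counter(rows)

# Keep only row patterns seen more than once, most frequent first.
duplicate_groups = sorted(
    ((row, count) for row, count in counts.items() if count > 1),
    key=lambda item: item[1],
    reverse=True,
)

n_groups = len(duplicate_groups)                               # distinct repeated patterns
n_surplus = sum(count - 1 for _, count in duplicate_groups)    # copies beyond the first

for row, count in duplicate_groups:
    print(row, count)
print(n_groups, n_surplus)
```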