Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 920
Duplicate rows (%): 9.2%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B

Variable types

Numeric: 3
Text: 1

Dataset

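Description: 처리일자 (processing date), 청소년유해업소업종코드 (business type code for establishments harmful to youth), 청소년유해업소업종명 (business type name for establishments harmful to youth), 건수 (count)
Author: 양천구 (Yangcheon-gu)
URL: https://data.seoul.go.kr/dataList/OA-10809/S/1/datasetView.do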

Alerts

Dataset has 920 (9.2%) duplicate rows (Duplicates)
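The alert percentage is the duplicate count divided by the number of observations (920 / 10000 = 9.2%). The report does not state how duplicates are counted, so the sketch below assumes one possible rule: every copy of a row after its first occurrence counts as a duplicate. The function name and the toy rows are hypothetical.

```python
from collections import Counter

def duplicate_summary(rows):
    """Return (extra_copies, share) under the assumed counting rule."""
    counts = Counter(rows)
    # Assumption: each occurrence beyond the first counts as one duplicate.
    extra = sum(c - 1 for c in counts.values() if c > 1)
    return extra, extra / len(rows)

# Hypothetical toy rows: (처리일자, 업종코드, 업종명, 건수)
toy_rows = [
    (20240428, 20399, "이용업 기타", 1),
    (20240428, 20399, "이용업 기타", 1),
    (20240428, 20399, "이용업 기타", 1),
    (20240508, 10409, "백화점", 1),
]
extra, share = duplicate_summary(toy_rows)

# The arithmetic behind the alert itself.
reported_share = 920 / 10000
```

If the tool counts distinct duplicated row groups instead, the numerator changes but the division by the number of observations stays the same.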

Reproduction

Analysis started: 2024-05-11 16:28:15.984342
Analysis finished: 2024-05-11 16:28:21.540071
Duration: 5.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

처리일자 (processing date)
Real number (ℝ)

Distinct: 30
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240448
Minimum: 20240411
Maximum: 20240510
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20240411
5-th percentile: 20240412
Q1: 20240418
Median: 20240425
Q3: 20240503
95-th percentile: 20240509
Maximum: 20240510
Range: 99
Interquartile range (IQR): 85
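Because 처리일자 is profiled as a number, the range and IQR above are differences between YYYYMMDD integers, not day counts. The sketch below, assuming the values really are dates in that format, recomputes both spans in calendar days; the helper name is hypothetical.

```python
from datetime import datetime

def parse_yyyymmdd(n):
    """Interpret an integer such as 20240411 as a calendar date."""
    return datetime.strptime(str(n), "%Y%m%d").date()

minimum, maximum = 20240411, 20240510
q1, q3 = 20240418, 20240503

# Differences as the report computes them (plain integer arithmetic).
numeric_range = maximum - minimum
numeric_iqr = q3 - q1

# The same spans measured in calendar days.
calendar_range = (parse_yyyymmdd(maximum) - parse_yyyymmdd(minimum)).days
calendar_iqr = (parse_yyyymmdd(q3) - parse_yyyymmdd(q1)).days
```

Under that assumption the minimum and maximum are 29 days apart, which covers 30 calendar days and agrees with the 30 distinct values reported for this variable.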

Descriptive statistics

Standard deviation: 40.299206
Coefficient of variation (CV): 1.9910234 × 10^-6
Kurtosis: -1.4537502
Mean: 20240448
Median Absolute Deviation (MAD): 9
Skewness: 0.69140701
Sum: 2.0240448 × 10^11
Variance: 1624.026
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)
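As a rough illustration of fixed-size binning, the sketch below assumes equal-width bins between the minimum and maximum, which is what the caption suggests but the report does not spell out. The function name and the toy input (one value per calendar day in the reported range) are hypothetical.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equal-width intervals spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum sits on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts

# Hypothetical toy input: one value for each day from 20240411 to 20240510.
april = list(range(20240411, 20240431))  # 20240411 .. 20240430
may = list(range(20240501, 20240511))    # 20240501 .. 20240510
hist = fixed_width_histogram(april + may, bins=30)
empty_bins = [i for i, c in enumerate(hist) if c == 0]
```

With YYYYMMDD integers the jump from 20240430 to 20240501 spans 71 units, so most of the 30 bins in this toy input stay empty even though no calendar day is missing.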
Common values
Value              Count  Frequency (%)
20240413           364    3.6%
20240429           361    3.6%
20240418           353    3.5%
20240510           353    3.5%
20240428           352    3.5%
20240509           351    3.5%
20240425           346    3.5%
20240419           345    3.5%
20240411           344    3.4%
20240415           344    3.4%
Other values (20)  6487   64.9%
Minimum 10 values
Value     Count  Frequency (%)
20240411  344    3.4%
20240412  324    3.2%
20240413  364    3.6%
20240414  330    3.3%
20240415  344    3.4%
20240416  331    3.3%
20240417  324    3.2%
20240418  353    3.5%
20240419  345    3.5%
20240420  331    3.3%
Maximum 10 values
Value     Count  Frequency (%)
20240510  353    3.5%
20240509  351    3.5%
20240508  334    3.3%
20240507  327    3.3%
20240506  337    3.4%
20240505  303    3.0%
20240504  315    3.1%
20240503  319    3.2%
20240502  317    3.2%
20240501  330    3.3%

청소년유해업소업종코드 (business type code)
Real number (ℝ)

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12973.293
Minimum: 10101
Maximum: 30111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10101
5-th percentile: 10103
Q1: 10113
Median: 10210
Q3: 10413
95-th percentile: 24201
Maximum: 30111
Range: 20010
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 5277.9828
Coefficient of variation (CV): 0.40683447
Kurtosis: 1.2365741
Mean: 12973.293
Median Absolute Deviation (MAD): 107
Skewness: 1.6157785
Sum: 1.2973293 × 10^8
Variance: 27857102
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
10403              245    2.5%
20301              243    2.4%
10117              242    2.4%
10112              236    2.4%
10402              236    2.4%
10102              234    2.3%
20105              234    2.3%
24205              233    2.3%
10301              233    2.3%
10106              230    2.3%
Other values (49)  7634   76.3%
Minimum 10 values
Value  Count  Frequency (%)
10101  209    2.1%
10102  234    2.3%
10103  230    2.3%
10104  217    2.2%
10105  225    2.2%
10106  230    2.3%
10107  206    2.1%
10108  120    1.2%
10109  17     0.2%
10110  178    1.8%
Maximum 10 values
Value  Count  Frequency (%)
30111  132    1.3%
30110  69     0.7%
24205  233    2.3%
24201  196    2.0%
24113  222    2.2%
24101  80     0.8%
20399  148    1.5%
20301  243    2.4%
20199  104    1.0%
20107  204    2.0%

청소년유해업소업종명 (business type name)
Text

Distinct: 54
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 4.2364
Min length: 2

Characters and Unicode

Total characters: 42364
Distinct characters: 128
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
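A minimal sketch of how per-character categories can be tallied with Python's standard `unicodedata` module. The mapping of two-letter category codes to the long names used in this report, the function name, and the two input strings (values taken from the sample rows) are illustrative choices, not the profiler's own code.

```python
import unicodedata
from collections import Counter

# Long names for the five general categories that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
}

def category_counts(values):
    """Tally the Unicode general category of every character in `values`."""
    counts = Counter()
    for text in values:
        for ch in text:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

values = ["비어(바)살롱", "이용업 기타"]
counts = category_counts(values)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables for this variable.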

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 백화점 (department store)
2nd row: 단란주점 (karaoke bar)
3rd row: 패스트푸드 (fast food)
4th row: 식육취급 (meat handling)
5th row: 간이주점 (small pub)
Value              Count  Frequency (%)
기타               861    8.4%
패스트푸드         447    4.4%
전통찻집           306    3.0%
관광호텔           267    2.6%
일반조리판매       245    2.4%
일반이용업         243    2.4%
까페               242    2.4%
다방               236    2.3%
호프(소주방        236    2.3%
여관업             234    2.3%
Other values (44)  6935   67.6%

Most occurring characters

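Value               Count  Frequency (%)
(not preserved)     1865   4.4%
(not preserved)     1571   3.7%
)                   1185   2.8%
(                   1185   2.8%
(not preserved)     1108   2.6%
(not preserved)     978    2.3%
(not preserved)     891    2.1%
(not preserved)     861    2.0%
(not preserved)     861    2.0%
(not preserved)     766    1.8%
Other values (118)  31093  73.4%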

Most occurring categories

Value              Count  Frequency (%)
Other Letter       39536  93.3%
Close Punctuation  1185   2.8%
Open Punctuation   1185   2.8%
Space Separator    252    0.6%
Other Punctuation  206    0.5%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(not preserved)     1865   4.7%
(not preserved)     1571   4.0%
(not preserved)     1108   2.8%
(not preserved)     978    2.5%
(not preserved)     891    2.3%
(not preserved)     861    2.2%
(not preserved)     861    2.2%
(not preserved)     766    1.9%
(not preserved)     758    1.9%
(not preserved)     718    1.8%
Other values (114)  29159  73.8%

Close Punctuation
Value  Count  Frequency (%)
)      1185   100.0%

Open Punctuation
Value  Count  Frequency (%)
(      1185   100.0%

Space Separator
Value    Count  Frequency (%)
(space)  252    100.0%

Other Punctuation
Value  Count  Frequency (%)
,      206    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  39536  93.3%
Common  2828   6.7%

Most frequent character per script

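Hangul
Value               Count  Frequency (%)
(not preserved)     1865   4.7%
(not preserved)     1571   4.0%
(not preserved)     1108   2.8%
(not preserved)     978    2.5%
(not preserved)     891    2.3%
(not preserved)     861    2.2%
(not preserved)     861    2.2%
(not preserved)     766    1.9%
(not preserved)     758    1.9%
(not preserved)     718    1.8%
Other values (114)  29159  73.8%

Common
Value    Count  Frequency (%)
)        1185   41.9%
(        1185   41.9%
(space)  252    8.9%
,        206    7.3%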

Most occurring blocks

Value   Count  Frequency (%)
Hangul  39536  93.3%
ASCII   2828   6.7%

Most frequent character per block

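Hangul
Value               Count  Frequency (%)
(not preserved)     1865   4.7%
(not preserved)     1571   4.0%
(not preserved)     1108   2.8%
(not preserved)     978    2.5%
(not preserved)     891    2.3%
(not preserved)     861    2.2%
(not preserved)     861    2.2%
(not preserved)     766    1.9%
(not preserved)     758    1.9%
(not preserved)     718    1.8%
Other values (114)  29159  73.8%

ASCII
Value    Count  Frequency (%)
)        1185   41.9%
(        1185   41.9%
(space)  252    8.9%
,        206    7.3%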

건수 (count)
Real number (ℝ)

Distinct: 736
Distinct (%): 7.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 153.5227
Minimum: 1
Maximum: 4930
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 23
Q3: 132
95-th percentile: 573
Maximum: 4930
Range: 4929
Interquartile range (IQR): 128

Descriptive statistics

Standard deviation: 381.74367
Coefficient of variation (CV): 2.4865617
Kurtosis: 41.64266
Mean: 153.5227
Median Absolute Deviation (MAD): 22
Skewness: 5.6235295
Sum: 1535227
Variance: 145728.23
Monotonicity: Not monotonic
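Several of the figures above are tied together by simple identities: variance is the square of the standard deviation, the coefficient of variation is the standard deviation divided by the mean, and the sum is the mean times the number of observations. The sketch below recomputes them from the reported (rounded) values for 건수; the variable names are illustrative.

```python
import math

# Values as reported for 건수, plus the dataset's number of observations.
n = 10000
mean = 153.5227
std = 381.74367
reported_variance = 145728.23
reported_cv = 2.4865617
reported_sum = 1535227

# Derived quantities from the standard identities.
variance = std ** 2
cv = std / mean
total = mean * n

# The inputs are rounded, so compare with a relative tolerance.
checks = {
    "variance": math.isclose(variance, reported_variance, rel_tol=1e-6),
    "cv": math.isclose(cv, reported_cv, rel_tol=1e-6),
    "sum": math.isclose(total, reported_sum, rel_tol=1e-9),
}
```

A CV above 2 together with a median of 23 against a mean of about 153.5 is consistent with the strong right skew reported for this variable.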
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
1                   1137   11.4%
2                   699    7.0%
4                   505    5.1%
3                   399    4.0%
5                   329    3.3%
7                   191    1.9%
8                   179    1.8%
6                   171    1.7%
11                  168    1.7%
9                   149    1.5%
Other values (726)  6073   60.7%
Minimum 10 values
Value  Count  Frequency (%)
1      1137   11.4%
2      699    7.0%
3      399    4.0%
4      505    5.1%
5      329    3.3%
6      171    1.7%
7      191    1.9%
8      179    1.8%
9      149    1.5%
10     93     0.9%
Maximum 10 values
Value  Count  Frequency (%)
4930   1      < 0.1%
4926   1      < 0.1%
4925   1      < 0.1%
4924   1      < 0.1%
4923   1      < 0.1%
4922   1      < 0.1%
4917   1      < 0.1%
4915   1      < 0.1%
3228   1      < 0.1%
3227   2      < 0.1%

Interactions

[Figures: nine interaction plots; images not preserved]

Correlations

Variable                처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
처리일자                1.000     0.000                   0.000                 0.022
청소년유해업소업종코드  0.000     1.000                   0.999                 0.136
청소년유해업소업종명    0.000     0.999                   1.000                 0.737
건수                    0.022     0.136                   0.737                 1.000
Variable                처리일자  청소년유해업소업종코드  건수
처리일자                1.000     0.002                   -0.009
청소년유해업소업종코드  0.002     1.000                   -0.261
건수                    -0.009    -0.261                  1.000
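The report does not name the coefficient behind each matrix. The negative entries in the second one indicate a signed measure, so the sketch below assumes Pearson's r purely for illustration. The function name is hypothetical, and the input is the ten (업종코드, 건수) pairs from the first sample table, far too few rows to reproduce the reported -0.261.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# (청소년유해업소업종코드, 건수) pairs from the ten rows of the first sample table.
codes = [10409, 10301, 10111, 10118, 10210, 20102, 10301, 10104, 24201, 20301]
counts = [1, 72, 30, 30, 4, 4, 67, 144, 6, 122]
r = pearson(codes, counts)
```

Because 업종코드 is a category label stored as a number, a linear coefficient between it and 건수 says little about the actual dependence between business type and count.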

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
2951   20240508  10409  백화점            1
22036  20240420  10301  단란주점          72
8839   20240502  10111  패스트푸드        30
19084  20240423  10118  식육취급          30
27132  20240416  10210  간이주점          4
4337   20240507  20102  일반호텔          4
25123  20240418  10301  단란주점          67
3183   20240508  10104  일식              144
901    20240510  24201  비디오물감상실업  6
18174  20240424  20301  일반이용업        122

Index  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
15208  20240427  10104  일식          145
837    20240510  10409  백화점        16
12374  20240429  10402  다방          122
21270  20240421  10299  기타          2
3208   20240508  10207  비어(바)살롱  3
11449  20240430  10110  출장조리      1
2736   20240508  10207  비어(바)살롱  4
32499  20240411  10412  커피숍        973
18310  20240424  30111  무도학원업    4
669    20240510  10208  룸살롱        17

Duplicate rows

Most frequently occurring

Index  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수  # duplicates
541    20240428  20399  이용업 기타       1  6
787    20240506  10413  전통찻집          1  6
80     20240413  24201  비디오물감상실업  1  5
123    20240415  10201  카바레            2  5
137    20240415  20399  이용업 기타       1  5
195    20240417  10413  전통찻집          1  5
216    20240418  10201  카바레            2  5
228    20240418  10413  전통찻집          1  5
232    20240418  20399  이용업 기타       1  5
262    20240419  10413  전통찻집          1  5