Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 912
Duplicate rows (%): 9.1%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B

Variable types

Numeric: 3
Text: 1

Dataset

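Description: 처리일자 (processing date), 청소년유해업소업종코드 (youth-harmful business type code), 청소년유해업소업종명 (youth-harmful business type name), 건수 (count)
Author: Seongdong-gu (성동구)
URL: https://data.seoul.go.kr/dataList/OA-10732/S/1/datasetView.do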

Alerts

Duplicates: dataset has 912 (9.1%) duplicate rows

Reproduction

Analysis started: 2024-05-11 16:42:25.772466
Analysis finished: 2024-05-11 16:42:30.490786
Duration: 4.72 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
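
The report does not include the script that produced it. A minimal sketch of how a comparable report could be generated with ydata-profiling follows. The function name, both file paths, and the report title are hypothetical, and the third-party imports are deferred so the function can be defined even where the packages are not installed.

```python
def build_profile(csv_path: str, html_path: str = "profile.html") -> None:
    """Profile a CSV file and write the report as HTML.

    Both paths are placeholders; the source report does not name its input file.
    """
    # Imported lazily: pandas and ydata-profiling are third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Profiling report (hypothetical title)")
    report.to_file(html_path)
```

Depending on how the source CSV was exported, `pd.read_csv` may need an explicit `encoding` argument; the report gives no information about that.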

Variables

처리일자 (processing date)
Real number (ℝ)

Distinct: 30
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240449
Minimum: 20240411
Maximum: 20240510
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20240411
5-th percentile: 20240412
Q1: 20240418
Median: 20240426
Q3: 20240503
95-th percentile: 20240509
Maximum: 20240510
Range: 99
Interquartile range (IQR): 85
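
The values of 처리일자 look like dates encoded as YYYYMMDD integers. That reading is an assumption, since the report types the column as a real number. Under that assumption, the Range and IQR above are differences between integers, not spans of days. The sketch below, using only the standard library, shows the difference; `parse_yyyymmdd` is a hypothetical helper.

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Interpret an integer such as 20240411 as a calendar date."""
    return datetime.strptime(str(value), "%Y%m%d").date()


# Minimum, Q1, Q3 and maximum copied from the quantile statistics above.
minimum, q1, q3, maximum = 20240411, 20240418, 20240503, 20240510

integer_range = maximum - minimum  # what the report lists as Range
integer_iqr = q3 - q1              # what the report lists as IQR

calendar_range = (parse_yyyymmdd(maximum) - parse_yyyymmdd(minimum)).days
calendar_iqr = (parse_yyyymmdd(q3) - parse_yyyymmdd(q1)).days

print(integer_range, calendar_range)
print(integer_iqr, calendar_iqr)
```

The integer differences are inflated because the subtraction crosses a month boundary. A calendar span of 29 days is also consistent with the 30 distinct values reported for this variable.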

Descriptive statistics

Standard deviation: 40.360852
Coefficient of variation (CV): 1.994069 × 10⁻⁶
Kurtosis: -1.4938373
Mean: 20240449
Median Absolute Deviation (MAD): 10
Skewness: 0.66288608
Sum: 2.0240449 × 10¹¹
Variance: 1628.9984
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)
Common values

Value              Count  Frequency (%)
20240507           371    3.7%
20240430           362    3.6%
20240416           359    3.6%
20240502           353    3.5%
20240413           350    3.5%
20240419           346    3.5%
20240506           346    3.5%
20240425           345    3.5%
20240429           344    3.4%
20240501           342    3.4%
Other values (20)  6482   64.8%

Minimum 10 values

Value     Count  Frequency (%)
20240411  325    3.2%
20240412  306    3.1%
20240413  350    3.5%
20240414  329    3.3%
20240415  313    3.1%
20240416  359    3.6%
20240417  317    3.2%
20240418  320    3.2%
20240419  346    3.5%
20240420  335    3.4%

Maximum 10 values

Value     Count  Frequency (%)
20240510  314    3.1%
20240509  329    3.3%
20240508  313    3.1%
20240507  371    3.7%
20240506  346    3.5%
20240505  331    3.3%
20240504  327    3.3%
20240503  321    3.2%
20240502  353    3.5%
20240501  342    3.4%

청소년유해업소업종코드 (youth-harmful business type code)
Real number (ℝ)

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12934.432
Minimum: 10101
Maximum: 30111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10101
5-th percentile: 10103
Q1: 10113
Median: 10210
Q3: 10413
95-th percentile: 24201
Maximum: 30111
Range: 20010
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 5248.5014
Coefficient of variation (CV): 0.40577751
Kurtosis: 1.2675857
Mean: 12934.432
Median Absolute Deviation (MAD): 107
Skewness: 1.6301653
Sum: 1.2934432 × 10⁸
Variance: 27546767
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value              Count  Frequency (%)
24205              254    2.5%
10101              254    2.5%
10412              243    2.4%
10208              241    2.4%
10116              236    2.4%
10199              233    2.3%
20101              232    2.3%
10104              232    2.3%
10107              232    2.3%
10410              230    2.3%
Other values (49)  7613   76.1%

Minimum 10 values

Value  Count  Frequency (%)
10101  254    2.5%
10102  211    2.1%
10103  225    2.2%
10104  232    2.3%
10105  218    2.2%
10106  222    2.2%
10107  232    2.3%
10108  127    1.3%
10109  24     0.2%
10110  197    2.0%

Maximum 10 values

Value  Count  Frequency (%)
30111  133    1.3%
30110  56     0.6%
24205  254    2.5%
24201  210    2.1%
24113  213    2.1%
24101  71     0.7%
20399  158    1.6%
20301  225    2.2%
20199  88     0.9%
20107  185    1.8%

청소년유해업소업종명 (youth-harmful business type name)
Text

Distinct: 54
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 4.2169
Min length: 2

Characters and Unicode

Total characters: 42169
Distinct characters: 128
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
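
As an illustration of the character properties mentioned above, the following sketch uses Python's standard `unicodedata` module to count general categories in a few values taken from this report. The two-letter codes it returns correspond to the category names used in the tables below: `Lo` is Other Letter, `Ps` is Open Punctuation, `Pe` is Close Punctuation, `Zs` is Space Separator, and `Po` is Other Punctuation. This is not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter

# Values copied from the sample rows and frequency tables of this report.
samples = ["노래연습장업", "호프(소주방)", "이용업 기타", "정종,대포집(선술집"]

# Count the Unicode general category of every character in every sample.
category_counts = Counter(
    unicodedata.category(ch) for text in samples for ch in text
)

for category, count in category_counts.most_common():
    print(category, count)
```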

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 노래연습장업 (karaoke room business)
2nd row: 카바레 (cabaret)
3rd row: 관광호텔 (tourist hotel)
4th row: 백화점 (department store)
5th row: 식육취급 (meat handling)

Words

Value               English gloss                Count  Frequency (%)
기타                other                        866    8.5%
패스트푸드          fast food                    426    4.2%
전통찻집            traditional tea house        325    3.2%
관광호텔            tourist hotel                282    2.8%
노래연습장업        karaoke room business        254    2.5%
한식                Korean cuisine               254    2.5%
커피숍              coffee shop                  243    2.4%
룸살롱              room salon                   241    2.4%
생선회              sliced raw fish              236    2.3%
정종,대포집(선술집  rice-wine bar, tavern (pub)  232    2.3%
Other values (44)                                6887   67.2%

Most occurring characters

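Value               Count  Frequency (%)
(glyph lost)        1818   4.3%
(glyph lost)        1595   3.8%
)                   1173   2.8%
(                   1173   2.8%
(glyph lost)        1074   2.5%
(glyph lost)        1000   2.4%
(glyph lost)        866    2.1%
(glyph lost)        866    2.1%
(glyph lost)        842    2.0%
(glyph lost)        789    1.9%
Other values (118)  30973  73.4%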

Most occurring categories

Value              Count  Frequency (%)
Other Letter       39345  93.3%
Close Punctuation  1173   2.8%
Open Punctuation   1173   2.8%
Space Separator    246    0.6%
Other Punctuation  232    0.6%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
(glyph lost)        1818   4.6%
(glyph lost)        1595   4.1%
(glyph lost)        1074   2.7%
(glyph lost)        1000   2.5%
(glyph lost)        866    2.2%
(glyph lost)        866    2.2%
(glyph lost)        842    2.1%
(glyph lost)        789    2.0%
(glyph lost)        747    1.9%
(glyph lost)        721    1.8%
Other values (114)  29027  73.8%

Close Punctuation

Value  Count  Frequency (%)
)      1173   100.0%

Open Punctuation

Value  Count  Frequency (%)
(      1173   100.0%

Space Separator

Value    Count  Frequency (%)
(space)  246    100.0%

Other Punctuation

Value  Count  Frequency (%)
,      232    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  39345  93.3%
Common  2824   6.7%

Most frequent character per script

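Hangul

Value               Count  Frequency (%)
(glyph lost)        1818   4.6%
(glyph lost)        1595   4.1%
(glyph lost)        1074   2.7%
(glyph lost)        1000   2.5%
(glyph lost)        866    2.2%
(glyph lost)        866    2.2%
(glyph lost)        842    2.1%
(glyph lost)        789    2.0%
(glyph lost)        747    1.9%
(glyph lost)        721    1.8%
Other values (114)  29027  73.8%

Common

Value    Count  Frequency (%)
)        1173   41.5%
(        1173   41.5%
(space)  246    8.7%
,        232    8.2%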

Most occurring blocks

Value   Count  Frequency (%)
Hangul  39345  93.3%
ASCII   2824   6.7%

Most frequent character per block

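Hangul

Value               Count  Frequency (%)
(glyph lost)        1818   4.6%
(glyph lost)        1595   4.1%
(glyph lost)        1074   2.7%
(glyph lost)        1000   2.5%
(glyph lost)        866    2.2%
(glyph lost)        866    2.2%
(glyph lost)        842    2.1%
(glyph lost)        789    2.0%
(glyph lost)        747    1.9%
(glyph lost)        721    1.8%
Other values (114)  29027  73.8%

ASCII

Value    Count  Frequency (%)
)        1173   41.5%
(        1173   41.5%
(space)  246    8.7%
,        232    8.2%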

건수 (count)
Real number (ℝ)

Distinct: 746
Distinct (%): 7.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 165.8995
Minimum: 1
Maximum: 4930
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 24
Q3: 141
95-th percentile: 764
Maximum: 4930
Range: 4929
Interquartile range (IQR): 137

Descriptive statistics

Standard deviation: 414.04986
Coefficient of variation (CV): 2.4957873
Kurtosis: 40.118133
Mean: 165.8995
Median Absolute Deviation (MAD): 23
Skewness: 5.4936683
Sum: 1658995
Variance: 171437.29
Monotonicity: Not monotonic
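
Several of the figures above can be derived from others, which gives a quick way to check the summary for internal consistency. The sketch below recomputes them from the reported values, assuming the usual definitions (range = maximum − minimum, IQR = Q3 − Q1, CV = standard deviation / mean, sum = mean × number of observations, variance = standard deviation squared). The variable names are illustrative.

```python
import math

# Figures copied from the 건수 summary above.
n = 10_000
mean, std = 165.8995, 414.04986
q1, q3, minimum, maximum = 4, 141, 1, 4930

checks = {
    "range": maximum - minimum,  # reported: 4929
    "iqr": q3 - q1,              # reported: 137
    "cv": std / mean,            # reported: 2.4957873
    "sum": mean * n,             # reported: 1658995
    "variance": std ** 2,        # reported: 171437.29
}

for name, value in checks.items():
    print(f"{name}: {value:.7g}")

# The recomputed values agree with the report up to rounding.
consistent = (
    checks["range"] == 4929
    and checks["iqr"] == 137
    and math.isclose(checks["cv"], 2.4957873, rel_tol=1e-6)
    and math.isclose(checks["sum"], 1658995, abs_tol=1e-3)
    and math.isclose(checks["variance"], 171437.29, abs_tol=0.01)
)
```

A mean (165.9) far above the median (24), together with a CV of about 2.5, reflects the strong right skew also indicated by the skewness and kurtosis values.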
Histogram with fixed size bins (bins=50)
Common values

Value               Count  Frequency (%)
1                   1095   10.9%
2                   744    7.4%
4                   474    4.7%
3                   401    4.0%
5                   325    3.2%
8                   202    2.0%
7                   184    1.8%
6                   169    1.7%
11                  168    1.7%
9                   150    1.5%
Other values (736)  6088   60.9%

Minimum 10 values

Value  Count  Frequency (%)
1      1095   10.9%
2      744    7.4%
3      401    4.0%
4      474    4.7%
5      325    3.2%
6      169    1.7%
7      184    1.8%
8      202    2.0%
9      150    1.5%
10     90     0.9%

Maximum 10 values

Value  Count  Frequency (%)
4930   1      < 0.1%
4929   1      < 0.1%
4926   3      < 0.1%
4925   1      < 0.1%
4924   1      < 0.1%
4923   1      < 0.1%
4922   1      < 0.1%
4921   1      < 0.1%
4920   1      < 0.1%
4916   2      < 0.1%

Interactions

[Figures: 9 pairwise interaction plots]

Correlations

Variable                처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
처리일자                1.000     0.031                   0.000                 0.000
청소년유해업소업종코드  0.031     1.000                   0.999                 0.141
청소년유해업소업종명    0.000     0.999                   1.000                 0.740
건수                    0.000     0.141                   0.740                 1.000
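
The value of 0.999 between 청소년유해업소업종코드 and 청소년유해업소업종명 suggests that each code almost always corresponds to a single name. The report lists 59 distinct codes against 54 distinct names, so the mapping cannot be strictly one-to-one in both directions. The sketch below shows one way to check the code-to-name direction, using (code, name) pairs copied from the Sample tables of this report; run on the full dataset, the same loop would reveal any code carrying more than one name.

```python
from collections import defaultdict

# (code, name) pairs copied from the Sample tables of this report.
pairs = [
    (24205, "노래연습장업"), (10201, "카바레"), (20101, "관광호텔"),
    (10409, "백화점"), (10118, "식육취급"), (10403, "일반조리판매"),
    (10401, "과자점"), (10409, "백화점"), (10403, "일반조리판매"),
    (10401, "과자점"),
]

# Collect every name observed for each code.
names_by_code = defaultdict(set)
for code, name in pairs:
    names_by_code[code].add(name)

# True when no code in the sample maps to more than one name.
one_name_per_code = all(len(names) == 1 for names in names_by_code.values())
print(len(names_by_code), one_name_per_code)
```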

Variable                처리일자  청소년유해업소업종코드  건수
처리일자                1.000     0.004                   -0.005
청소년유해업소업종코드  0.004     1.000                   -0.264
건수                    -0.005    -0.264                  1.000
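
The report does not label the coefficient behind either matrix. For readers who want to recompute a numeric-only matrix themselves, Spearman rank correlation is one common choice; whether it is the coefficient used here is an assumption. The following standard-library sketch implements it as Pearson correlation on average ranks and applies it to the ten rows of the first Sample table, so the result is illustrative only and should not be expected to match the figures above.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Codes and counts copied from the first Sample table of this report.
codes = [24205, 10201, 20101, 10409, 10118, 20105, 10406, 30111, 10403, 10401]
counts = [75, 5, 87, 24, 13, 127, 2, 4, 168, 12]
rho = spearman(codes, counts)
print(round(rho, 3))
```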

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

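First rows

Index  처리일자  청소년유해업소업종코드  청소년유해업소업종명  English gloss               건수
29440  20240507  24205                   노래연습장업          karaoke room business       75
1335   20240412  10201                   카바레                cabaret                     5
18323  20240427  20101                   관광호텔              tourist hotel               87
12765  20240422  10409                   백화점                department store            24
6691   20240417  10118                   식육취급              meat handling               13
19452  20240428  20105                   여관업                inn business                127
29062  20240507  10406                   고속도로              highway                     2
29957  20240508  30111                   무도학원업            dance academy business      4
13453  20240423  10403                   일반조리판매          general cooked-food sales   168
21706  20240430  10401                   과자점                confectionery               12

Last rows

Index  처리일자  청소년유해업소업종코드  청소년유해업소업종명  English gloss               건수
23678  20240502  10403                   일반조리판매          general cooked-food sales   246
18378  20240427  10112                   호프(소주방)          beer hall (soju bar)        247
24270  20240503  10101                   한식                  Korean cuisine              1081
26806  20240505  10207                   비어(바)살롱          beer (bar) salon            6
19506  20240428  10102                   중국식                Chinese cuisine             126
1899   20240412  24201                   비디오물감상실업      video viewing room business 1
19159  20240428  10411                   패스트푸드            fast food                   29
20361  20240429  10409                   백화점                department store            56
16715  20240426  10401                   과자점                confectionery               9
11733  20240421  10106                   뷔페식                buffet                      6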

Duplicate rows

Most frequently occurring

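Index  처리일자  청소년유해업소업종코드  청소년유해업소업종명  English gloss                건수  # duplicates
604    20240430  10413                   전통찻집              traditional tea house        1     7
671    20240502  24201                   비디오물감상실업      video viewing room business  1     7
471    20240426  10206                   스텐드바              stand bar                    1     6
664    20240502  10413                   전통찻집              traditional tea house        1     6
820    20240507  24201                   비디오물감상실업      video viewing room business  1     6
871    20240509  20399                   이용업 기타           barber business, other       1     6
65     20240413  10206                   스텐드바              stand bar                    1     5
83     20240413  20399                   이용업 기타           barber business, other       1     5
106    20240414  24201                   비디오물감상실업      video viewing room business  1     5
266    20240419  24201                   비디오물감상실업      video viewing room business  1     5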
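
The report gives a total of 912 duplicate rows but does not state the counting convention behind that number. The sketch below, using only the standard library, shows three conventions that give different totals on the same data. The rows are toy data shaped like this dataset (처리일자, code, name, 건수), not an excerpt from it.

```python
from collections import Counter

# Toy rows shaped like (처리일자, code, name, 건수); the values are illustrative.
rows = [
    (20240430, 10413, "전통찻집", 1),
    (20240430, 10413, "전통찻집", 1),
    (20240430, 10413, "전통찻집", 1),
    (20240502, 24201, "비디오물감상실업", 1),
    (20240502, 24201, "비디오물감상실업", 1),
    (20240507, 24205, "노래연습장업", 75),
]

# Number of occurrences of each distinct row, keeping only repeated rows.
occurrences = Counter(rows)
repeated = {row: count for row, count in occurrences.items() if count > 1}

distinct_repeated = len(repeated)                  # distinct rows seen more than once
rows_in_groups = sum(repeated.values())            # every row that has an identical twin
extra_copies = rows_in_groups - distinct_repeated  # rows dropped by de-duplication

print(distinct_repeated, rows_in_groups, extra_copies)
```

The `# duplicates` column in the table above corresponds to the per-row occurrence count (`repeated` in this sketch).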