Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 934
Duplicate rows (%): 9.3%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B
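
The two derived figures in this block follow from the raw ones by simple arithmetic. A minimal sketch, assuming the report rounds to one decimal place and that 1 KiB is 1024 bytes (the variable names are illustrative and not taken from the profiling tool):

```python
# Figures copied from the dataset statistics in this report.
n_rows = 10_000
n_duplicate_rows = 934
total_size_kib = 419.9

# Share of duplicate rows, as a percentage of all rows.
duplicate_pct = 100 * n_duplicate_rows / n_rows

# Average record size in bytes (assumes 1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```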

Variable types

Numeric: 3
Text: 1

Dataset

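Description: Columns are 처리일자 (processing date), 청소년유해업소업종코드 (business type code for establishments harmful to youth), 청소년유해업소업종명 (business type name for establishments harmful to youth), and 건수 (count)
Author: Jongno-gu (종로구)
URL: https://data.seoul.go.kr/dataList/OA-9808/S/1/datasetView.do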

Alerts

Duplicates: Dataset has 934 (9.3%) duplicate rows

Reproduction

Analysis started: 2024-05-11 15:52:56.153182
Analysis finished: 2024-05-11 15:53:01.664963
Duration: 5.51 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

처리일자 (processing date)
Real number (ℝ)

Distinct: 30
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240449
Minimum: 20240411
Maximum: 20240510
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20240411
5-th percentile: 20240412
Q1: 20240418
Median: 20240426
Q3: 20240503
95-th percentile: 20240509
Maximum: 20240510
Range: 99
Interquartile range (IQR): 85
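
Because 처리일자 is profiled as a plain number, the Range and IQR above are differences between integers, not numbers of days. A minimal sketch, assuming the values encode calendar dates as YYYYMMDD (the helper name `to_date` is illustrative):

```python
from datetime import date, datetime


def to_date(yyyymmdd: int) -> date:
    """Parse an integer such as 20240411 into a calendar date (assumes YYYYMMDD)."""
    return datetime.strptime(str(yyyymmdd), "%Y%m%d").date()


# Quantile values copied from this report.
minimum, q1, q3, maximum = 20240411, 20240418, 20240503, 20240510

# Integer differences, as reported by the profiler.
print(maximum - minimum)  # 99
print(q3 - q1)            # 85

# The same gaps measured in calendar days.
print((to_date(maximum) - to_date(minimum)).days)  # 29
print((to_date(q3) - to_date(q1)).days)            # 15
```

Under that assumption the column spans 29 days, which agrees with the 30 distinct values reported for it.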

Descriptive statistics

Standard deviation: 40.460779
Coefficient of variation (CV): 1.999006 × 10^-6
Kurtosis: -1.4930568
Mean: 20240449
Median Absolute Deviation (MAD): 11
Skewness: 0.66294233
Sum: 2.0240449 × 10^11
Variance: 1637.0746
Monotonicity: Not monotonic
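
Two of the descriptive statistics can be cross-checked from the others: the variance is the square of the standard deviation, and the coefficient of variation is the standard deviation divided by the mean. A small sketch using the rounded figures printed above, so agreement is only expected up to rounding:

```python
import math

# Figures copied from this report (already rounded by the profiler).
mean = 20240449
std = 40.460779

variance = std ** 2  # variance = standard deviation squared
cv = std / mean      # coefficient of variation = standard deviation / mean

print(f"variance = {variance:.4f}")
print(f"cv = {cv:.6e}")

# Compare against the reported values, allowing for rounding.
print(math.isclose(variance, 1637.0746, rel_tol=1e-6))
print(math.isclose(cv, 1.999006e-6, rel_tol=1e-6))
```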
Histogram with fixed size bins (bins=30)

Most frequent values
Value | Count | Frequency (%)
20240412 | 367 | 3.7%
20240428 | 363 | 3.6%
20240415 | 357 | 3.6%
20240501 | 355 | 3.5%
20240509 | 355 | 3.5%
20240422 | 352 | 3.5%
20240508 | 352 | 3.5%
20240429 | 350 | 3.5%
20240414 | 348 | 3.5%
20240507 | 348 | 3.5%
Other values (20) | 6453 | 64.5%

Smallest 10 values
Value | Count | Frequency (%)
20240411 | 332 | 3.3%
20240412 | 367 | 3.7%
20240413 | 328 | 3.3%
20240414 | 348 | 3.5%
20240415 | 357 | 3.6%
20240416 | 306 | 3.1%
20240417 | 316 | 3.2%
20240418 | 337 | 3.4%
20240419 | 317 | 3.2%
20240420 | 300 | 3.0%

Largest 10 values
Value | Count | Frequency (%)
20240510 | 335 | 3.4%
20240509 | 355 | 3.5%
20240508 | 352 | 3.5%
20240507 | 348 | 3.5%
20240506 | 339 | 3.4%
20240505 | 313 | 3.1%
20240504 | 338 | 3.4%
20240503 | 303 | 3.0%
20240502 | 308 | 3.1%
20240501 | 355 | 3.5%

청소년유해업소업종코드 (business type code)
Real number (ℝ)

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12886.737
Minimum: 10101
Maximum: 30111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10101
5-th percentile: 10103
Q1: 10113
Median: 10208
Q3: 10413
95-th percentile: 24201
Maximum: 30111
Range: 20010
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 5216.7282
Coefficient of variation (CV): 0.40481373
Kurtosis: 1.3986303
Mean: 12886.737
Median Absolute Deviation (MAD): 105
Skewness: 1.66305
Sum: 1.2886737 × 10^8
Variance: 27214253
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value | Count | Frequency (%)
10104 | 257 | 2.6%
10102 | 254 | 2.5%
20101 | 250 | 2.5%
10411 | 244 | 2.4%
10117 | 242 | 2.4%
10103 | 237 | 2.4%
10112 | 237 | 2.4%
10208 | 236 | 2.4%
20105 | 236 | 2.4%
10105 | 234 | 2.3%
Other values (49) | 7573 | 75.7%

Smallest 10 values
Value | Count | Frequency (%)
10101 | 215 | 2.1%
10102 | 254 | 2.5%
10103 | 237 | 2.4%
10104 | 257 | 2.6%
10105 | 234 | 2.3%
10106 | 212 | 2.1%
10107 | 213 | 2.1%
10108 | 120 | 1.2%
10109 | 17 | 0.2%
10110 | 186 | 1.9%

Largest 10 values
Value | Count | Frequency (%)
30111 | 132 | 1.3%
30110 | 61 | 0.6%
24205 | 233 | 2.3%
24201 | 179 | 1.8%
24113 | 219 | 2.2%
24101 | 81 | 0.8%
20399 | 149 | 1.5%
20301 | 214 | 2.1%
20199 | 87 | 0.9%
20107 | 185 | 1.8%

청소년유해업소업종명 (business type name)
Text

Distinct: 54
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 4.2022
Min length: 2

Characters and Unicode

Total characters: 42022
Distinct characters: 128
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
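
As a rough illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to count general categories in three 업종명 values taken from the sample rows of this report. It is not the profiler's own implementation; the two-letter codes correspond to the category names used here (Lo = Other Letter, Ps = Open Punctuation, Pe = Close Punctuation, Po = Other Punctuation, Zs = Space Separator).

```python
import unicodedata
from collections import Counter

# Three category names copied from the sample rows of this report.
names = ["호프(소주방)", "정종,대포집(선술집)", "숙박업 기타"]

# Count the Unicode general category of every character.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

for category, count in category_counts.most_common():
    print(category, count)
```

On these three strings every Hangul syllable falls into Lo, and the parentheses, comma, and space account for the remaining four categories, matching the five distinct categories reported above.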

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 출장조리
2nd row: 일반조리판매
3rd row: 패스트푸드
4th row: 식육취급
5th row: 호프(소주방)

Most frequent words
Value | Count | Frequency (%)
기타 | 881 | 8.6%
패스트푸드 | 466 | 4.6%
전통찻집 | 308 | 3.0%
관광호텔 | 291 | 2.8%
일식 | 257 | 2.5%
중국식 | 254 | 2.5%
까페 | 242 | 2.4%
호프(소주방 | 237 | 2.3%
경양식 | 237 | 2.3%
룸살롱 | 236 | 2.3%
Other values (44) | 6827 | 66.7%

Most occurring characters

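Value | Count | Frequency (%)
(Hangul character not preserved) | 1776 | 4.2%
(Hangul character not preserved) | 1650 | 3.9%
) | 1214 | 2.9%
( | 1214 | 2.9%
(Hangul character not preserved) | 1062 | 2.5%
(Hangul character not preserved) | 985 | 2.3%
(Hangul character not preserved) | 881 | 2.1%
(Hangul character not preserved) | 881 | 2.1%
(Hangul character not preserved) | 805 | 1.9%
(Hangul character not preserved) | 784 | 1.9%
Other values (118) | 30770 | 73.2%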

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 39145 | 93.2%
Close Punctuation | 1214 | 2.9%
Open Punctuation | 1214 | 2.9%
Space Separator | 236 | 0.6%
Other Punctuation | 213 | 0.5%

Most frequent character per category

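Other Letter
Value | Count | Frequency (%)
(Hangul character not preserved) | 1776 | 4.5%
(Hangul character not preserved) | 1650 | 4.2%
(Hangul character not preserved) | 1062 | 2.7%
(Hangul character not preserved) | 985 | 2.5%
(Hangul character not preserved) | 881 | 2.3%
(Hangul character not preserved) | 881 | 2.3%
(Hangul character not preserved) | 805 | 2.1%
(Hangul character not preserved) | 784 | 2.0%
(Hangul character not preserved) | 767 | 2.0%
(Hangul character not preserved) | 734 | 1.9%
Other values (114) | 28820 | 73.6%

Close Punctuation
Value | Count | Frequency (%)
) | 1214 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1214 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 236 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 213 | 100.0%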

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 39145 | 93.2%
Common | 2877 | 6.8%

Most frequent character per script

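Hangul
Value | Count | Frequency (%)
(Hangul character not preserved) | 1776 | 4.5%
(Hangul character not preserved) | 1650 | 4.2%
(Hangul character not preserved) | 1062 | 2.7%
(Hangul character not preserved) | 985 | 2.5%
(Hangul character not preserved) | 881 | 2.3%
(Hangul character not preserved) | 881 | 2.3%
(Hangul character not preserved) | 805 | 2.1%
(Hangul character not preserved) | 784 | 2.0%
(Hangul character not preserved) | 767 | 2.0%
(Hangul character not preserved) | 734 | 1.9%
Other values (114) | 28820 | 73.6%

Common
Value | Count | Frequency (%)
) | 1214 | 42.2%
( | 1214 | 42.2%
(space) | 236 | 8.2%
, | 213 | 7.4%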

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 39145 | 93.2%
ASCII | 2877 | 6.8%

Most frequent character per block

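Hangul
Value | Count | Frequency (%)
(Hangul character not preserved) | 1776 | 4.5%
(Hangul character not preserved) | 1650 | 4.2%
(Hangul character not preserved) | 1062 | 2.7%
(Hangul character not preserved) | 985 | 2.5%
(Hangul character not preserved) | 881 | 2.3%
(Hangul character not preserved) | 881 | 2.3%
(Hangul character not preserved) | 805 | 2.1%
(Hangul character not preserved) | 784 | 2.0%
(Hangul character not preserved) | 767 | 2.0%
(Hangul character not preserved) | 734 | 1.9%
Other values (114) | 28820 | 73.6%

ASCII
Value | Count | Frequency (%)
) | 1214 | 42.2%
( | 1214 | 42.2%
(space) | 236 | 8.2%
, | 213 | 7.4%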

건수 (count)
Real number (ℝ)

Distinct: 748
Distinct (%): 7.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 157.6491
Minimum: 1
Maximum: 4926
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 25
Q3: 141
95-th percentile: 608.05
Maximum: 4926
Range: 4925
Interquartile range (IQR): 137
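
A non-integer percentile such as 608.05 on an integer-valued column suggests interpolation between neighbouring order statistics. The sketch below shows one common convention, linear interpolation at position (n - 1) · q / 100; whether the profiler uses exactly this method is an assumption. The helper name `percentile` is illustrative, and the input is the 건수 column of the first ten sample rows shown in this report, not the full dataset.

```python
def percentile(values, q):
    """q-th percentile (0..100) by linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must be non-empty")
    pos = (len(ordered) - 1) * q / 100       # fractional index into the sorted data
    lower = int(pos)                         # index of the order statistic below
    upper = min(lower + 1, len(ordered) - 1) # index of the order statistic above
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


# 건수 values of the first ten sample rows in this report.
sample = [6, 274, 26, 27, 287, 60, 124, 2651, 113, 7]

q1, median, q3 = (percentile(sample, q) for q in (25, 50, 75))
print(q1, median, q3, q3 - q1)
```

Even on ten rows the single value 2651 sits far above the upper quartile, which mirrors the gap between the median (25) and the mean (157.6491) in the full column.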

Descriptive statistics

Standard deviation: 392.11397
Coefficient of variation (CV): 2.4872579
Kurtosis: 44.417409
Mean: 157.6491
Median Absolute Deviation (MAD): 24
Skewness: 5.7542663
Sum: 1576491
Variance: 153753.36
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value | Count | Frequency (%)
1 | 1103 | 11.0%
2 | 710 | 7.1%
4 | 494 | 4.9%
3 | 404 | 4.0%
5 | 325 | 3.2%
7 | 206 | 2.1%
8 | 173 | 1.7%
6 | 172 | 1.7%
15 | 139 | 1.4%
11 | 133 | 1.3%
Other values (738) | 6141 | 61.4%

Smallest 10 values
Value | Count | Frequency (%)
1 | 1103 | 11.0%
2 | 710 | 7.1%
3 | 404 | 4.0%
4 | 494 | 4.9%
5 | 325 | 3.2%
6 | 172 | 1.7%
7 | 206 | 2.1%
8 | 173 | 1.7%
9 | 130 | 1.3%
10 | 88 | 0.9%

Largest 10 values
Value | Count | Frequency (%)
4926 | 3 | < 0.1%
4925 | 1 | < 0.1%
4924 | 2 | < 0.1%
4923 | 2 | < 0.1%
4922 | 1 | < 0.1%
4920 | 1 | < 0.1%
4917 | 1 | < 0.1%
4916 | 1 | < 0.1%
3228 | 1 | < 0.1%
3227 | 1 | < 0.1%

Interactions

[Figure: pairwise interaction plots between the numeric variables (9 panels); images not preserved]

Correlations

 | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
처리일자 | 1.000 | 0.000 | 0.000 | 0.021
청소년유해업소업종코드 | 0.000 | 1.000 | 0.999 | 0.138
청소년유해업소업종명 | 0.000 | 0.999 | 1.000 | 0.737
건수 | 0.021 | 0.138 | 0.737 | 1.000
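
The 0.999 association between 청소년유해업소업종코드 and 청소년유해업소업종명 indicates that the name is almost fully determined by the code. The report also lists 59 distinct codes against 54 distinct names, so at least one name must appear under more than one code. A minimal sketch for finding such names, using (code, name) pairs copied from rows shown in this report; the helper name `codes_per_name` is illustrative, and the handful of rows here is far too small to reveal the actual overlaps:

```python
from collections import defaultdict


def codes_per_name(pairs):
    """Map each 업종명 to the sorted list of distinct 업종코드 values seen with it."""
    seen = defaultdict(set)
    for code, name in pairs:
        seen[name].add(code)
    return {name: sorted(codes) for name, codes in seen.items()}


# (code, name) pairs copied from rows shown in this report.
pairs = [
    (10110, "출장조리"), (10403, "일반조리판매"), (10111, "패스트푸드"),
    (10413, "전통찻집"), (20399, "이용업 기타"), (20199, "숙박업 기타"),
    (10499, "기타"), (10403, "일반조리판매"),
]

mapping = codes_per_name(pairs)

# Names that appear under more than one code.
shared = {name: codes for name, codes in mapping.items() if len(codes) > 1}
print(shared)
```

Run against the full dataset, the `shared` dictionary would list the names responsible for the difference between the two distinct counts.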

 | 처리일자 | 청소년유해업소업종코드 | 건수
처리일자 | 1.000 | -0.006 | -0.003
청소년유해업소업종코드 | -0.006 | 1.000 | -0.277
건수 | -0.003 | -0.277 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
8840 | 20240502 | 10110 | 출장조리 | 6
15041 | 20240427 | 10403 | 일반조리판매 | 274
19800 | 20240422 | 10111 | 패스트푸드 | 26
3130 | 20240508 | 10118 | 식육취급 | 27
14730 | 20240427 | 10112 | 호프(소주방) | 287
28423 | 20240415 | 10107 | 정종,대포집(선술집) | 60
22072 | 20240420 | 20105 | 여관업 | 124
26138 | 20240417 | 10101 | 한식 | 2651
11065 | 20240430 | 20301 | 일반이용업 | 113
27870 | 20240415 | 10106 | 뷔페식 | 7

(index) | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
24721 | 20240418 | 10413 | 전통찻집 | 1
360 | 20240510 | 10113 | 통닭(치킨) | 80
2032 | 20240509 | 10103 | 경양식 | 148
23957 | 20240419 | 20199 | 숙박업 기타 | 11
5952 | 20240505 | 10412 | 커피숍 | 429
29902 | 20240413 | 10499 | 기타 | 300
21632 | 20240421 | 10109 | 이동조리 | 1
28719 | 20240414 | 10207 | 비어(바)살롱 | 2
21534 | 20240421 | 10403 | 일반조리판매 | 132
8973 | 20240502 | 24113 | 일반게임제공업 | 14

Duplicate rows

Most frequently occurring

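(index) | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수 | # duplicates
491 | 20240426 | 20399 | 이용업 기타 | 1 | 6
638 | 20240430 | 24201 | 비디오물감상실업 | 1 | 6
24 | 20240411 | 10413 | 전통찻집 | 1 | 5
142 | 20240415 | 10119 | 탕류 | 1 | 5
178 | 20240416 | 10201 | 카바레 | 2 | 5
220 | 20240417 | 20399 | 이용업 기타 | 1 | 5
301 | 20240420 | 10202 | 고고(디스코)클럽 | 1 | 5
353 | 20240422 | 10201 | 카바레 | 2 | 5
470 | 20240425 | 24201 | 비디오물감상실업 | 1 | 5
660 | 20240501 | 10413 | 전통찻집 | 1 | 5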
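
A table like this one can be produced by counting identical rows and keeping those that occur more than once. The sketch below does this with `collections.Counter`. The row values are taken from the table, but the repetition pattern is shortened and hypothetical, and the helper name `duplicate_groups` is illustrative rather than part of the profiling tool.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return (row, count) for every row occurring more than once, most frequent first."""
    counts = Counter(rows)
    return [(row, n) for row, n in counts.most_common() if n > 1]


# Rows are (처리일자, 업종코드, 업종명, 건수) tuples; repetitions are illustrative only.
rows = [
    (20240426, 20399, "이용업 기타", 1),
    (20240426, 20399, "이용업 기타", 1),
    (20240416, 10201, "카바레", 2),
    (20240426, 20399, "이용업 기타", 1),
    (20240416, 10201, "카바레", 2),
    (20240411, 10413, "전통찻집", 1),
]

for row, n in duplicate_groups(rows):
    print(row, n)
```

Rows must be hashable for this to work, which is why each record is represented as a tuple.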