These are the steps needed to load local data and prepare it for preprocessing.
Load Packages
    import os
    from glob import glob

    import numpy as np

    import tensorflow as tf
    from PIL import Image

    import matplotlib.pyplot as plt
    %matplotlib inline
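    os.getcwd()

    os.listdir()
    os.listdir('dataset/mnist_png/training/')

    # training/ contains only the class folders '0'..'9', so this pattern
    # matches no files; '*/*.png' would reach the images inside them.
    glob('dataset/mnist_png/training/*.png')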
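Because the images sit one level below `training/`, the glob pattern needs a wildcard for the class folder as well. The following is a minimal sketch, assuming a `training/<label>/<name>.png` layout; the temporary directory and the file names are made up so the snippet runs on its own, and `data_paths` and `labels` are illustrative names.

```python
import os
import tempfile
from glob import glob

# Build a tiny stand-in for dataset/mnist_png/training/ (hypothetical files).
root = tempfile.mkdtemp()
train_dir = os.path.join(root, 'training')
for label, name in [('0', 'a.png'), ('0', 'b.png'), ('1', 'c.png')]:
    os.makedirs(os.path.join(train_dir, label), exist_ok=True)
    open(os.path.join(train_dir, label, name), 'wb').close()

# '*.png' looks only directly inside training/, where there are no images.
flat_paths = glob(os.path.join(train_dir, '*.png'))

# '*/*.png' descends one level into each class folder.
data_paths = sorted(glob(os.path.join(train_dir, '*', '*.png')))

# The class label is the name of the folder that contains the file.
labels = [os.path.basename(os.path.dirname(p)) for p in data_paths]

print(len(flat_paths), len(data_paths), labels)
```

Sorting the result is optional, but `glob` does not guarantee an order, so a sorted list keeps later steps reproducible.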
Data Analysis
    label_nums = os.listdir('dataset/mnist_png/training/')
    label_nums
    > ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

    len(label_nums)
    > 10
Comparing the Number of Images per Class
    nums_dataset = []

    for lbl_n in label_nums:
        # Same base path as the os.listdir call used to build label_nums.
        data_per_class = os.listdir('dataset/mnist_png/training/' + lbl_n)
        nums_dataset.append(len(data_per_class))

    nums_dataset
    > [5923, 6742, 5958, 6131, 5842, 5421, 5918, 6265, 5851, 5949]
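The counts are easier to read when each one stays attached to its label. This is a sketch under the same layout assumption; `count_per_class` is a hypothetical helper, and the temporary folder stands in for the real dataset.

```python
import os
import tempfile


def count_per_class(train_dir):
    """Map each class folder name to the number of entries inside it."""
    counts = {}
    # os.listdir returns names in arbitrary order, so sort for stable output.
    for label in sorted(os.listdir(train_dir)):
        class_dir = os.path.join(train_dir, label)
        if os.path.isdir(class_dir):
            counts[label] = len(os.listdir(class_dir))
    return counts


# Hypothetical miniature dataset: 2 files for '0', 1 file for '1'.
train_dir = tempfile.mkdtemp()
for label, n_files in [('0', 2), ('1', 1)]:
    os.makedirs(os.path.join(train_dir, label))
    for i in range(n_files):
        open(os.path.join(train_dir, label, f'{i}.png'), 'wb').close()

counts = count_per_class(train_dir)
print(counts)
```

On the real data, the list above ranges from 5421 to 6742 images per class and sums to 60,000, so the classes are roughly but not perfectly balanced.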
Opening an Image with TensorFlow
    # path is the file path of a single image (a string).
    gfile = tf.io.read_file(path)
    image = tf.io.decode_image(gfile)
Finding the Image Sizes
    from tqdm import tqdm_notebook

    heights = []
    widths = []

    # data_paths: list of image file paths, e.g. collected with glob.
    for path in tqdm_notebook(data_paths):
        image_pil = Image.open(path)
        image = np.array(image_pil)
        # Unpacking into two values works because the images are 2-D grayscale.
        h, w = image.shape

        heights.append(h)
        widths.append(w)
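Once `heights` and `widths` are filled, a quick tally shows whether every image has the same size, which matters because batching usually expects uniform shapes. This is a minimal sketch; the short lists are placeholder values (MNIST images are 28×28), not output from the loop, and `size_counts` and `all_same_size` are illustrative names.

```python
from collections import Counter

# Placeholder values only; in the post these lists are filled by the loop.
heights = [28, 28, 28, 28]
widths = [28, 28, 28, 28]

# Count how often each (height, width) pair occurs.
size_counts = Counter(zip(heights, widths))

# A single distinct pair means every image has the same size.
all_same_size = len(size_counts) == 1

print(size_counts.most_common(1), all_same_size)
```

If more than one pair shows up, the images would need resizing or padding before they can be stacked into a batch.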