ResourceExhaustedError: OOM pri prideľovaní tensor s tvarom[32,32,239,239] a typu float

0

Otázka

Snažím, aby znovu CNN rozpoznávanie obrazu model z tejto knihe(model 1) pomocou rôznych obrázkov. Avšak, montáž modelu vráti mi ResourceExhaustedError v prvej doby. Na veľkosti dávky je už relatívne malý, tak hádam problém je s mojím vzorom definícia, ktorú som skopírované z papiera. Všetky rady, čo zmeniť s modelom ocenia. Ďakujeme!

#Load dataset
BATCH_SIZE = 32
IMG_SIZE = (244,244)
train_set = tf.keras.preprocessing.image_dataset_from_directory(
    main_dir, 
    shuffle = True,
    image_size = IMG_SIZE,
    batch_size = BATCH_SIZE)
val_set = tf.keras.preprocessing.image_dataset_from_directory(
    main_dir, 
    shuffle = True, 
    image_size = IMG_SIZE,
    batch_size = BATCH_SIZE)
class_names = train_set.class_names
print(class_names)

#Augment data by flipping image and random rotation
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.experimental.preprocessing.RandomFlip('horizontal'),
    tf.keras.layers.experimental.preprocessing.RandomRotation(0.2),
])

#Model definition 
model = Sequential([
    data_augmentation,
    tf.keras.layers.experimental.preprocessing.Rescaling(1./255),
    Conv2D(filters=64,kernel_size=(4,4), activation='relu'),
    Conv2D(filters=32,kernel_size=(3,3), activation='relu'),
    AveragePooling2D(pool_size=(4,4)),

    Conv2D(filters=32,kernel_size=(3,3), activation='relu'),
    Conv2D(filters=32,kernel_size=(3,3), activation='relu'),
    Conv2D(filters=32,kernel_size=(3,3), activation='relu'),
    AveragePooling2D(pool_size=(2,2)),
    Flatten(),
    
    Dense(256, activation='relu'),
    Dense(256, activation='relu'),
    Dense(128, activation='relu'),
    Dense(128, activation='relu'),
    Dense(128, activation='tanh'),
    Dense(1, activation='softmax')

])

model.compile(optimizer='RMSprop',
              loss=keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=[keras.metrics.CategoricalAccuracy()])

history = model.fit(train_set,validation_data=val_set, epochs=150)

Chyba po montáž modelu:

ResourceExhaustedError:  OOM when allocating tensor with shape[32,32,239,239] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
     [[node gradient_tape/sequential_1/average_pooling2d/AvgPoolGrad (defined at <ipython-input-10-ef749d320491>:1) ]]

nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.91.03    Driver Version: 460.91.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce 940MX       Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   46C    P0    N/A /  N/A |   1938MiB /  2004MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       959      G   /usr/lib/xorg/Xorg                 97MiB |
|    0   N/A  N/A      1270      G   /usr/bin/gnome-shell               25MiB |
|    0   N/A  N/A      4635      G   /usr/lib/firefox/firefox          212MiB |
|    0   N/A  N/A      5843      C   /usr/bin/python3                 1595MiB |
+-----------------------------------------------------------------------------+

1

Najlepšiu odpoveď

0

Chyba je spôsobené tým, že váš GPU uteká z pamäte. Môže to byť buď preto, 1) ste zaťaženie príliš veľa dát na jednu epochu aj pre vaše GPU, alebo 2) ak sa stane, že majú dosť VRAM, iný proces si vyhradzuje niektoré pamäťové medzi obdobiami. Môže sa to stať, pretože tensorflow si vyhradzuje 100% VRAM pri behu. Môžete obmedziť množstvo VRAM vyhradené len to, čo je potrebné.

import tensorflow as tf
for gpu in tf.config.list_physical_devices("GPU")
    tf.config.experimental.set_memory_growth(gpu, True)

Edit: pri Pohľade na grafickú kartu, som si takmer istý, že nemáte dostatok VRAM pre váš veľkosti dávky a veľkosť textu (takže problém 1). Mali by ste znížiť veľkosti dávky.

2021-11-23 17:57:42

V iných jazykoch

Táto stránka je v iných jazykoch

Русский
..................................................................................................................
Italiano
..................................................................................................................
Polski
..................................................................................................................
Română
..................................................................................................................
한국어
..................................................................................................................
हिन्दी
..................................................................................................................
Français
..................................................................................................................
Türk
..................................................................................................................
Česk
..................................................................................................................
Português
..................................................................................................................
ไทย
..................................................................................................................
中文
..................................................................................................................
Español
..................................................................................................................