Comments (5)
Here is another update with the performance improvements using the proposed changes above.
Timings depend on the GPU used for training; in this case a Titan X (Pascal). Time to train 10 epochs:
Original select_triplets method
Total time to train 10 epochs = 2h 10m = 2*60 + 10 = ~130 min
Time per epoch = 130/10 = 13 min
With the proposed enhancement (one training session only)
Total time to train 10 epochs = 45 min
Time per epoch = 45/10 = 4.5 min
Speed-up of about 2.89x
13 / 4.5 = ~2.89
YMMV. It's nice to see that such a minor change can yield this level of performance improvement,
from facenet.
Currently none of the above bullets are handled.
Is there a good way to compare the identities between the datasets? Or does it involve a lot of manual work? For the MsCeleb dataset the MIDs from Freebase have been used to get unique identities for the classes, and it would be good to have the same for the other datasets as well.
I use a simple Java program to merge the CASIA and FaceScrub datasets and print out the classes/sub-directories that have the same name. I validated a dozen of these merged classes and found no problems at all. How likely is it that two different celebrities have exactly the same name? It would be very rare, so let's go with this assumption and simply merge the datasets together. One of the reasons for doing this is the triplet selection of the semi-hard negative: according to the algorithm, if an image of the same person is randomly selected from the other dataset, it is very likely to end up being selected as the semi-hard negative, which would push the training in the wrong direction.
Based on the same assumption (that the sub-directory name in a dataset is unique across datasets), all the classes that exist in LFW are removed from the merged dataset before training. This is how the merged dataset is processed.
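The merge-and-filter step described above could be sketched in Python instead of Java. This is a hypothetical sketch, not the author's actual program; the directory layout (one sub-directory per identity) and all paths in the usage comment are assumptions:

```python
import os
import shutil

def merge_datasets(src_dirs, dst_dir, exclude_names):
    """Merge class sub-directories from several datasets into dst_dir,
    skipping any class whose name appears in exclude_names (e.g. LFW).
    Returns the class names that occurred in more than one dataset."""
    os.makedirs(dst_dir, exist_ok=True)
    duplicates = []
    for src in src_dirs:
        for cls in sorted(os.listdir(src)):
            if cls in exclude_names:
                continue  # drop identities that overlap with the evaluation set
            src_cls = os.path.join(src, cls)
            dst_cls = os.path.join(dst_dir, cls)
            if os.path.isdir(dst_cls):
                duplicates.append(cls)  # same name in both datasets: merge images
            else:
                os.makedirs(dst_cls)
            for img in os.listdir(src_cls):
                shutil.copy(os.path.join(src_cls, img), dst_cls)
    return duplicates

# Hypothetical usage (paths are assumptions):
# lfw_names = set(os.listdir('datasets/lfw'))
# dups = merge_datasets(['datasets/casia', 'datasets/facescrub'],
#                       'datasets/merged', lfw_names)
```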
I have two training sessions that are running concurrently on the same machine, here are my observations so far:
- Compared with the chart you posted on the wiki, and with my own training results using the original datasets, the LFW chart is smoother, i.e. shows fewer fluctuations (using Tensorboard with smoothing factor=0).
- LFW accuracy curve seems to be the same and I am currently waiting for the training to reach 250 epochs to compare the results from my previous run.
It seems that the "select_triplets" routine is taking more than 60% of the training time. Do you have any plan to improve the performance of this routine? Based on my initial testing, the following idea seems to run faster.
import numpy as np

num_per_class = 45
people_per_batch = 40
num_of_images = num_per_class * people_per_batch
# Create a random array with values between 0..1
embeddings = np.random.random((num_of_images, 128)).astype('f')
# Create an array to hold the pairwise squared distances
dists = np.zeros((num_of_images, num_of_images)).astype('f')
maxfloat = np.finfo(np.float32).max
# Squared distance between each pair of embeddings, one row at a time
for i in np.arange(0, num_of_images):
    dists[i] = np.sum(np.square(np.subtract(embeddings, embeddings[i])), 1)
# Fill the diagonal with the max float32 value to prevent an anchor
# from being selected as its own negative
np.fill_diagonal(dists, maxfloat)
# Get the pos_dist out of the array and then fill those entries with
# maxfloat to prevent them from being selected as negatives
# ...
# Find the argmin as the index of the negative
idx_semi_hard_negs = np.argmin(dists, 1)
# Continue with the rest of the select_triplets routine.
# ...
Thanks,
Hi David,
I tried the performance improvement idea and it actually works pretty well. The triplet selection time has improved from 10+ seconds (time/selection on Tensorboard) to less than 3 seconds. I believe further improvement is possible in these for loops. The key idea is that the distance matrix is pre-calculated in advance, and the for loops then just look up the distance values. I validated the results and the implementation should be correct.
def select_triplets(embeddings, num_per_class, image_data, people_per_batch, alpha):

    def dist(emb1, emb2):
        x = np.square(np.subtract(emb1, emb2))
        return np.sum(x, 0)

    nrof_images = image_data.shape[0]
    # Pre-calculate the distance matrix
    dists = np.zeros((nrof_images, nrof_images))
    for i in np.arange(0, nrof_images):
        dists[i] = np.sum(np.square(np.subtract(embeddings, embeddings[i])), 1)

    nrof_triplets = nrof_images - people_per_batch
    shp = [nrof_triplets, image_data.shape[1], image_data.shape[2], image_data.shape[3]]
    as_arr = np.zeros(shp)
    ps_arr = np.zeros(shp)
    ns_arr = np.zeros(shp)

    trip_idx = 0
    shuffle = np.arange(nrof_triplets)
    np.random.shuffle(shuffle)
    emb_start_idx = 0
    nrof_random_negs = 0
    for i in xrange(people_per_batch):
        n = num_per_class[i]
        for j in range(1, n):
            a_idx = emb_start_idx
            p_idx = emb_start_idx + j
            as_arr[shuffle[trip_idx]] = image_data[a_idx]
            ps_arr[shuffle[trip_idx]] = image_data[p_idx]
            # Select a semi-hard negative that has a distance
            # further away from the positive exemplar.
            # pos_dist = dist(embeddings[a_idx][:], embeddings[p_idx][:])
            pos_dist = dists[a_idx, p_idx]
            # Start from a random image outside the anchor's class
            sel_neg_idx = emb_start_idx
            while sel_neg_idx >= emb_start_idx and sel_neg_idx <= emb_start_idx + n - 1:
                sel_neg_idx = (np.random.randint(1, 2**32) % nrof_images) - 1
            # sel_neg_dist = dist(embeddings[a_idx][:], embeddings[sel_neg_idx][:])
            sel_neg_dist = dists[a_idx, sel_neg_idx]
            random_neg = True
            for k in range(nrof_images):
                if k < emb_start_idx or k > emb_start_idx + n - 1:
                    # neg_dist = dist(embeddings[a_idx][:], embeddings[k][:])
                    neg_dist = dists[a_idx, k]
                    if pos_dist < neg_dist and neg_dist < sel_neg_dist and np.abs(pos_dist - neg_dist) < alpha:
                        random_neg = False
                        sel_neg_dist = neg_dist
                        sel_neg_idx = k
            if random_neg:
                nrof_random_negs += 1
            ns_arr[shuffle[trip_idx]] = image_data[sel_neg_idx]
            trip_idx += 1
        emb_start_idx += n

    triplets = (as_arr, ps_arr, ns_arr)
    return triplets, nrof_random_negs, nrof_triplets
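As a further sketch (not part of the original comment), the per-row loop that builds the distance matrix can be collapsed into a single broadcasted expression using the identity ||a-b||^2 = ||a||^2 + ||b||^2 - 2ab. `embeddings` is assumed to be an (N, d) float array:

```python
import numpy as np

def pairwise_sq_dists(embeddings):
    """Squared Euclidean distance matrix, fully vectorized."""
    sq_norms = np.sum(np.square(embeddings), axis=1)
    dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * embeddings @ embeddings.T
    # Clamp tiny negative values caused by floating-point rounding
    return np.maximum(dists, 0.0)
```

This produces the same matrix as the row-by-row loop but in one matrix multiplication, which NumPy dispatches to BLAS.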
That is a very good speed-up indeed!! If you could make a pull request out of it I would be happy to merge it!
Lately I have been running the training as a classifier and it has worked out pretty well. But I like the triplet loss, and I think it can also be used for fine-tuning the network when training is started in classifier mode.
For the overlapping identities issue I think the most elegant solution would be to do the "filtering" when parsing through the dataset. But it will require some python hacking...
I guess there could also be classes from different datasets that belong to the same identity but, due to differences in spelling, are treated as different identities. I'm not sure it would be a big issue, though.