face-alignment | :fire: 2D and 3D face alignment library built using PyTorch | Computer Vision library
kandi X-RAY | face-alignment Summary
:fire: 2D and 3D face alignment library built using PyTorch
Top functions reviewed by kandi - BETA
- Get all landmarks from an image
- Get landmarks from image
- Transform a point
- Crops an image
- Flips the image
- Predict on the given image
- Detect the face of the given image
- Predict on a batch of images
- Runs a batch of face detection
- Compute non-maximum suppression (NMS) over detections at a given overlap threshold (a sketch follows this list)
- Filter a list of bounding boxes
- Detect the faces of the given image
- Convert a tensor or path to a numpy array
- Find the version string
- Read the contents of a file
- Detect faces from a given image
- Load anchors from file
- Load anchors from numpy array
- Detect face from image
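The NMS and box-filtering entries above describe standard post-processing of the raw detector boxes. Below is a minimal greedy, IoU-based NMS sketch in NumPy; the (x1, y1, x2, y2) box layout and the function name are illustrative and not the library's internal API:

import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    # Greedy non-maximum suppression: keep the highest-scoring box,
    # drop every remaining box whose IoU with it exceeds the threshold,
    # then repeat with the best surviving box.
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]            # indices, best score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # intersection of the kept box with the remaining candidates
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou <= iou_threshold]
    return keep

With boxes [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]] and scores [0.9, 0.8, 0.7], the sketch keeps indices 0 and 2: the second box overlaps the first with IoU of roughly 0.68 and is suppressed.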
face-alignment Key Features
face-alignment Examples and Code Snippets
#/dev/mmcblk1 which is the SD card
UUID=ff2b8c97-7882-4967-bc94-e41ed07f3b83 /media/mendel ext4 defaults 0 2
$ cd /media/mendel
# Create a swapfile, else you'll run out of memory while compiling.
$ sudo mkdir swapfile
$ cd swapfile
# Now let's increase the size of swap (the 2 GB size here is only an example)
$ sudo dd if=/dev/zero of=swapfile bs=1024 count=2000000
$ sudo chmod 600 swapfile
$ sudo mkswap swapfile
$ sudo swapon swapfile
faces_emore/
    train.idx
    train.rec
    property
    lfw.bin
    cfp_fp.bin
    agedb_30.bin
python tools/mx_recordio_2_ofrecord_shuffled_npart.py --data_dir datasets/faces_emore --output_filepath faces_emore/ofrecord/train --part_
import face_alignment
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from skimage import io
import collections
# Optionally set detector and some additional detector parameters
face_detector = 'sfd'
face_detector_kwargs = {"filter_threshold": 0.8}
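A hedged continuation of the snippet above, roughly following the repository README (the enum member is spelled LandmarksType.TWO_D in recent releases and LandmarksType._2D in older ones; the image path is a placeholder):

fa = face_alignment.FaceAlignment(face_alignment.LandmarksType.TWO_D,
                                  flip_input=True,
                                  face_detector=face_detector,
                                  face_detector_kwargs=face_detector_kwargs)

# Read an image and predict 68 2D landmarks for every detected face
input_img = io.imread('test/assets/aflw-test.jpg')   # placeholder path
preds = fa.get_landmarks_from_image(input_img)        # list with one (68, 2) array per face

plt.imshow(input_img)
for face_landmarks in preds:
    plt.scatter(face_landmarks[:, 0], face_landmarks[:, 1], s=2)
plt.show()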
def rect_to_bb(rect):
    # take a bounding box predicted by dlib and convert it
    # to the format (x, y, w, h) as we would normally do
    # with OpenCV
    x = rect.left()
    y = rect.top()
    w = rect.right() - x
    h = rect.bottom() - y
    return (x, y, w, h)
!wget http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2
!bunzip2 "shape_predictor_68_face_landmarks.dat.bz2"
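A short usage sketch tying the downloaded 68-point predictor to rect_to_bb above; the image path is a placeholder, and cv2 and dlib are assumed to be installed:

import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

img = cv2.imread("example.jpg")                  # placeholder image path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

for rect in detector(gray, 1):                   # 1 = upsample the image once
    (x, y, w, h) = rect_to_bb(rect)              # OpenCV-style (x, y, w, h) box
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
    shape = predictor(gray, rect)                # 68 landmark points
    for i in range(shape.num_parts):
        p = shape.part(i)
        cv2.circle(img, (p.x, p.y), 1, (0, 0, 255), -1)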
import glob

jobs = glob.glob("*.jpg")
# un-parallelized version (a multiprocessing variant is sketched after these snippets)
for picname in jobs:
    print(picname)
    aligned = FL.getAligns(picname)  # FL: instance of the helper class whose getAligns is shown below
def getAligns(self,
              img,
              use_cnn=False,
              savepath=None,
              return_info=False):
images = dlib.get_face_chips(img, faces, size=320)
image = dlib.get_face_chip(img, faces[0])
for face in faces:
    x = face.left()
    y = face.top()                        # top edge of the dlib rectangle
    w = face.right() - face.left()
    h = face.bottom() - face.top()
    (x1, y1, w1, h1) = rect_to_bb(face)   # same values via the helper above
    # rest same as above
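Since the loop above is flagged as the un-parallelized version, one way to fan it out over several worker processes is multiprocessing.Pool. This is only a sketch: it assumes FL and jobs are defined at module level so forked workers can see them, and the worker count is an example.

from multiprocessing import Pool

def align_one(picname):
    # each worker aligns a single image
    print(picname)
    return picname, FL.getAligns(picname)

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        results = dict(pool.map(align_one, jobs))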
Community Discussions
Trending Discussions on face-alignment
QUESTION
I am very new to Torch/CUDA, and I'm trying to test the small binary network (~1.5 MB) from https://github.com/1adrianb/binary-face-alignment, but I keep running into 'out of memory' issues.
I am using a relatively weak GPU (NVIDIA Quadro K600) with ~900 MB of graphics memory, on Ubuntu 16.04 with CUDA 10.0 and cuDNN 5.1. So I don't really care about performance, but I thought I would at least be able to run a small network for prediction, one image at a time (especially one that is supposedly aimed at those "with Limited Resources").
I managed to run the code in headless mode and found the memory consumption to be around 700 MB, which would explain why it fails immediately when an X server is running, since that takes around 250 MB of GPU memory.
I also added some logs to see how far along main.lua I get, and it is the call output:copy(model:forward(img)), on the very first image, that runs out of memory.
For reference, here's the main.lua code up until the crash:
...
ANSWER
Answered 2019-Apr-11 at 20:18
What usually consumes most of the memory are the activation maps (and gradients, when training). I am not familiar with this particular model and implementation, but I would say that you are using a "fake" binary network; by fake I mean they still use floating-point numbers to represent the binary values, since most users are going to use their code on GPUs that do not fully support real binary operations. The authors even write in Section 5:
Performance. In theory, by replacing all floating-point multiplications with bitwise XOR and making use of the SWAR (Single instruction, multiple data within a register) [5], [6], the number of operations can be reduced up to 32x when compared against the multiplication-based convolution. However, in our tests, we observed speedups of up to 3.5x, when compared against cuBLAS, for matrix multiplications, a result being in accordance with those reported in [6]. We note that we did not conduct experiments on CPUs. However, given the fact that we used the same method for binarization as in [5], similar improvements in terms of speed, of the order of 58x, are to be expected: as the real-valued network takes 0.67 seconds to do a forward pass on a i7-3820 using a single core, a speedup close to x58 will allow the system to run in real-time. In terms of memory compression, by removing the biases, which have minimum impact (or no impact at all) on performance, and by grouping and storing every 32 weights in one variable, we can achieve a compression rate of 39x when compared against the single precision counterpart of Torch.
In this context, a small model (w.r.t. number of parameters or model size in MiB) does not necessarily mean a low memory footprint. It is likely that all this memory is being used to store the activation maps in single- or double-precision.
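A rough back-of-the-envelope check of that explanation: a handful of feature maps stored in float32 already costs tens of MiB, before counting the CUDA context, cuDNN workspaces, and the framework's own buffers, which easily exhausts a ~900 MB card. The layer shapes below are illustrative, not taken from the binary-face-alignment model:

# Memory used by activation maps stored as float32 (4 bytes per value)
# for a few hypothetical (channels, height, width) layer shapes.
layers = [(64, 256, 256), (128, 128, 128), (256, 64, 64), (256, 64, 64)]

total_bytes = 0
for c, h, w in layers:
    n_bytes = c * h * w * 4            # float32
    total_bytes += n_bytes
    print(f"{c:4d} x {h:3d} x {w:3d} -> {n_bytes / 2**20:6.1f} MiB")

print(f"total for one forward pass: {total_bytes / 2**20:.1f} MiB")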
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install face-alignment
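The package is published on PyPI, so pip install face-alignment is the usual route; installation from a clone of the repository is described in its README.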
Support