mirror of https://github.com/TomHodson/tomhodson.github.com.git synced 2025-06-26 10:01:18 +02:00

Tom 75448efdcc Update 2025-01-18-heic-depth.md

2025-01-18 23:48:15 +00:00

---
title: Unexpected Depths
layout: post
excerpt: Did you know iPhone portrait mode HEIC files have a depth map in them?
assets: /assets/blog/heic_depth_map
thumbnail: /assets/blog/heic_depth_map/thumbnail.png
social_image: /assets/blog/heic_depth_map/thumbnail.png
alt: An image of the text "{...}" to suggest the idea of a template.
head: |
    <script async src="/node_modules/es-module-shims/dist/es-module-shims.js"></script>
    <script type="importmap">
    {
      "imports": {
        "three": "/node_modules/three/build/three.module.min.js",
        "three/addons/": "/node_modules/three/examples/jsm/",
        "lil-gui": "/node_modules/lil-gui/dist/lil-gui.esm.min.js"
      }
    }
    </script>
    <script src="/assets/js/projects.js" type="module"></script>
---

You know how iPhones do this fake depth of field effect where they blur the background? Did you know that the depth information used to do that effect is stored in the file?

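# pip install numpy pillow pillow-heif pypcd4

from pathlib import Path

import numpy as np
from PIL import Image
from pillow_heif import HeifImagePlugin  # noqa: F401, importing this registers HEIC support with Pillow

d = Path("wherever")

img = Image.open(d / "test_image.heic")

# Portrait-mode photos carry their depth maps in the image info.
depth_im = img.info["depth_images"][0]
pil_depth_im = depth_im.to_pillow()
pil_depth_im.save(d / "depth.png")

# Shrink the photo to the depth map's resolution; PIL sizes are (width, height).
depth_array = np.asarray(depth_im)
rgb_rescaled = img.resize(depth_array.shape[::-1])
rgb_rescaled.save(d / "rgb.png")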

A lovely picture of my face and a depth map of it.

Crazy! I had a play with projecting this into 3D to see what it would look like. I was too lazy to look deeply into how this should be interpreted geometrically, so initially I just pretended the image was taken from infinitely far away (an orthographic projection) and then eyeballed the units. The fact that this looks at all reasonable makes me wonder if the depths are somehow reprojected to match that assumption. Otherwise you'd also need to know the properties of the lens that was used to take the photo.
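For comparison, here is a minimal sketch of what the non-lazy version might look like if the depths were true distances along the optical axis and the camera behaved like an ideal pinhole. The function name `backproject_pinhole` and the `focal_px` parameter (focal length in pixels) are made up for illustration, and the principal point is simply assumed to sit at the image centre; none of this comes from the HEIC file itself.

```python
import numpy as np


def backproject_pinhole(depth, focal_px):
    """Turn an (H, W) depth map into (H*W, 3) points using a pinhole model.

    focal_px is a hypothetical focal length in pixels; the principal point
    is assumed to be the centre of the image.
    """
    h, w = depth.shape
    # v is the row index and u the column index of every pixel.
    v, u = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    cx, cy = (w - 1) / 2, (h - 1) / 2
    # Similar triangles: sideways offset grows linearly with depth.
    x = (u - cx) * depth / focal_px
    y = (v - cy) * depth / focal_px
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)


# A flat wall 2 units away, seen by a tiny 3x5 "sensor".
depth = np.full((3, 5), 2.0)
points = backproject_pinhole(depth, focal_px=4.0)
print(f"{points.shape = }")
```

Under the orthographic assumption the x and y coordinates would not depend on depth at all, which is the main difference between the two approaches.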

The handy pypcd4 Python library made outputting the data quite easy, and three.js has a module for displaying point cloud data. You can see that when writing numpy code I tend to scatter `print(f"{array.shape = }, {array.dtype = }")` liberally throughout; it just makes keeping track of those arrays so much easier.
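In case the `=` inside those f-string braces is unfamiliar: from Python 3.8 onwards it prints the expression text followed by its value. A tiny sketch with a made-up array:

```python
import numpy as np

# A throwaway array, just to have something to inspect.
array = np.zeros((4, 3), dtype=np.float32)

# The "=" specifier echoes the expression itself, then the repr of its value.
summary = f"{array.shape = }, {array.dtype = }"
print(summary)  # array.shape = (4, 3), array.dtype = dtype('float32')
```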

from pypcd4 import PointCloud

# Depth map as a 2D array, one value per pixel.
np_im = np.asarray(pil_depth_im)
n, m = np_im.shape
aspect = n / m
x = np.linspace(0, 2 * aspect, n)
y = np.linspace(0, 2, m)

rgb_points = np.array(rgb_rescaled).reshape(-1, 3)
print(f"{rgb_points.shape = }, {rgb_points.dtype = }")
rgb_packed = PointCloud.encode_rgb(rgb_points).reshape(-1, 1)
print(f"{rgb_packed.shape = }, {rgb_packed.dtype = }")

print(np.min(np_im), np.max(np_im))

# indexing='ij' keeps the grid in the same row-major order as np_im.
mesh = np.array(np.meshgrid(x, y, indexing='ij'))

xy_points = mesh.reshape(2, -1).T
print(f"{xy_points.shape = }")

# Normalise the 8-bit depth values to [0, 1].
z = np_im.reshape(-1, 1).astype(np.float64) / 255.0

# Rescale to the range recorded in the depth image metadata.
meta = pil_depth_im.info["metadata"]
depth_range = meta["d_max"] - meta["d_min"]
z = depth_range * z + meta["d_min"]

xyz_rgb_points = np.concatenate([xy_points, z, rgb_packed], axis=-1)
print(f"{xyz_rgb_points.shape = }")

pc = PointCloud.from_xyzrgb_points(xyz_rgb_points)
pc.save(d / "pointcloud.pcd")
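
The depth rescaling step above hard-codes 255, which only fits 8-bit depth maps. A slightly more general sketch, assuming the stored samples span the full range of their integer dtype (that is an assumption, not something checked against the HEIC spec), with `rescale_depth` being a hypothetical helper name:

```python
import numpy as np


def rescale_depth(raw, d_min, d_max):
    """Map integer depth samples linearly onto [d_min, d_max]."""
    # Assumption: samples use the full range of their integer dtype,
    # e.g. 0..255 for uint8 or 0..65535 for uint16.
    full_scale = np.iinfo(raw.dtype).max
    z = raw.astype(np.float64) / full_scale
    return (d_max - d_min) * z + d_min


# Made-up samples and range, purely to show the mapping.
raw = np.array([[0, 255], [51, 102]], dtype=np.uint8)
z_scaled = rescale_depth(raw, d_min=1.0, d_max=3.0)
print(z_scaled)
```

Whether the resulting numbers are distances or something like disparity depends on how the file encodes them, so the units still need eyeballing.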

Click and drag to spin me around. It didn't really capture my nose very well, I guess this is more a foreground/background kinda thing.
