
A quarter and a camera: Measuring relative abundance of artifacts with computer vision in Python

Previously, I showed how to pull some extra quality out of photogrammetric models.

This post is going to take a short dive into the world of computer vision. You’ll learn how to deploy a Python script using OpenCV to take measurements of many artifacts in a single photo. We will modify a script originally developed by Adrian Rosebrock at PyImageSearch to measure objects in an image. The modified script additionally calculates 2D surface area for each artifact and exports the measurements to a CSV file.

Download the script here.

Why use surface area?

Artifact surface area can be used as a measure of relative abundance of artifact types that is less susceptible to error from fragmentation than a simple count. Weight, although a valiant contender, runs into issues when comparing artifacts made of materials with different densities. As good as it sounds, though, measuring the surface area of irregular objects is a task barely worth the effort using analogue methods.

The first encounter I had with surface area being used in artifact analysis was in a 1997 article by John Byrd and Dalford Owens. Byrd and Owens use what they term Effective Area as a substitute for surface area.

Effective Area is indeed one measure of surface area, but the technique has the potential to be quite destructive, as it “drops the sherd through a set of nested screens”, similar to screening soils for grain-size analysis.

Byrd and Owens focus their attention specifically on sherds; however, given the right research question, the technique could theoretically be applied to any assemblage of artifacts.

It’s easy to see why this technique hasn’t been applied to a whole slew of artifacts, as it has the potential of destroying or damaging fragmented remains. The intention of Byrd and Owens wasn’t to provide the best solution, but to present one that gives a better measurement of relative abundance than simple counts and, in some cases, weights. They explain why they believe surface area hadn’t been adopted:


"Measuring surface area ... will not likely be adopted by archaeologists unless a relatively simple technique is developed for obtaining such data."

Byrd & Owens, 1997


Well, I’m happy to say thanks to Adrian at PyImageSearch, whose blog allowed me to begin adopting computer vision and machine learning in Python for my own archaeological uses. The following tutorial brings us one step closer to the simple technique alluded to by Byrd and Owens, using only a quarter and a camera!

Computer vision?

Computer vision is an interdisciplinary field of research devoted to “extract[ing] meaning from pixels” (Singh, 2019).

I won’t spend too much time here on the details of what else computer vision can do; you can learn more about that at PyImageSearch. I will, however, give a rundown of what this particular script is doing and how you can use it!

I’m assuming you already have Python 3 and OpenCV installed, but if you don’t, please take the time to do so now. OpenCV now installs with pip, so you can install it alongside the other required modules.
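If you’re starting from scratch, something like the following should pull in everything the script imports (the package names here are the usual PyPI ones; opencv-python is the pip distribution of OpenCV):

# install the required modules (run from your command-line)
python -m pip install opencv-python imutils scipy numpy pandas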

The script

Okay, so here we go. You can either read along and copy each code block into your favorite text editor saving the file with a .py extension, or you can download the source code here.

File directory

The source code comes as a .zip folder containing the artifact-area.py file and two subdirectories called images and csv.

The subfolder images is where you will place the image you would like to analyze, and csv is where the .csv file is saved to after measuring.

Inside images you will find a single image named test_01.jpg. This is the image used in the tutorial below. The image contains a Canadian quarter and a series of lithic flakes produced while preparing the biface I knapped for the 3D model shown in a previous post.

Tutorial image.

A few things worthy of a note about the photo:

The background heavily contrasts the colour of the material being measured.

The lighting is not great.

The quarter and its placement at the far left of the image. The quarter is used as a scale bar to convert pixels to a real-world measurement. Canadian quarters have a known diameter of 2.381cm.

The 90 degree angle of the photo. I know it is second nature in artifact photography to take the photo at 90 degrees. Here it is critical, as it impacts the accuracy of the resulting measurements. I estimated the angle by eye, which definitely impacted the results, but the resulting accuracy is still pretty good. A quick way to get your photos to 90 degrees without a tripod is to use a hot shoe bubble level.

You likely can’t notice, but I had to turn the photo quality on my Canon EOS REBEL T5i to the lowest setting to obtain photos without heavy noise during processing. It seems counter-intuitive to take low quality photos, but the extra detail at the distance I was shooting from created a lot of false objects in the results. This does raise the possibility of shooting high resolution images from a higher vantage point to measure many more artifacts in a single image.

Script usage and required modules

"""
 Title: Automated Rapid Artifact Surface Area Measurement from Imagery using Computer Vision
 Author: Wesley Weatherbee
 Date: February 2020
 Description: This script is intended to rapidly collect measurements
              from multiple artifacts in a single image using computer vision.
 
 Modified from: https://www.pyimagesearch.com/2016/03/28/measuring-size-of-objects-in-an-image-with-opencv/
 Original Author: Adrian Rosebrock
 Date: March 28, 2016 
"""

# USAGE
# open command-line (cmd.exe) and change the directory to the 
# directory where the file is loaded using: cd [path to directory here]
# after changing the directory, enter the following code in command-line:
# 
# python artifact-area.py --image images/test_01.jpg --width 2.381

# import the necessary packages
from scipy.spatial import distance as dist
from imutils import perspective
from imutils import contours
import numpy as np
import argparse
import imutils
import cv2
import csv
import pandas as pd

The above codeblock imports the required modules to run the script. The script is run from the command-line.

  1. Change the current directory to the artifact-area folder using cd followed by the path.

  2. Use the code python artifact-area.py --image images/test_01.jpg --width 2.381 to run the script. --image is the argument to choose which image to use, and --width is the width of your reference object to the far left of the photo in whatever units you choose.

Define methods used in the script

# define the automatic canny edge detection method
def auto_canny(image, sigma=0.33):
	# compute the median of the single channel pixel intensities
	v = np.median(image)

	# apply automatic Canny edge detection using the computed median
	lower = int(max(0, (1.0 - sigma) * v))
	upper = int(min(255, (1.0 + sigma) * v))
	edged = cv2.Canny(image, lower, upper)

	# return the edged image
	return edged

# define midpoint method
def midpoint(ptA, ptB):
	return ((ptA[0] + ptB[0]) * 0.5, (ptA[1] + ptB[1]) * 0.5)

This script uses two custom methods defined above: automatic Canny edge detection, def auto_canny (from PyImageSearch), and a method that calculates the midpoint between two sets of coordinates, def midpoint. The second method enables us to measure the X and Y dimensions of the minimum bounding box of each object. These are different from length and width because they don’t follow the axis running from striking platform to termination of the flake, but they are still important metrics for checking the accuracy of your measurements against the quarter.

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to the input image")
ap.add_argument("-w", "--width", type=float, required=True,
	help="width of the left-most object in the image (in inches)")
args = vars(ap.parse_args())

The argument parser is now constructed and reads the --image and --width arguments defined in the command-line.

Loading and preprocessing the image

Now we use OpenCV to read the image defined in the command-line argument. We convert it to grayscale, then apply a Gaussian blur twice: first with a 9x9 kernel, then with a 5x5 kernel.

# load the image, convert it to grayscale, and blur it slightly, twice
image = cv2.imread(args["image"])
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (9, 9), 0)
gray = cv2.GaussianBlur(gray, (5, 5), 0)

# perform edge detection, then perform a dilation + erosion to
# close gaps in between object edges
edged = auto_canny(gray)
edged = cv2.dilate(edged, None, iterations=3)
edged = cv2.erode(edged, None, iterations=2)

The auto_canny method is deployed on the blurred grayscale image. Dilation and erosion are run on the resulting edge map to close the boundaries.

Finding and sorting the contours

Here, closed boundary contours are identified in the image. We store the contours in the variable cnts and then sort them from left to right.

# find contours in the edge map
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)

# sort the contours from left-to-right and initialize the
# 'pixels per metric' calibration variable
(cnts, _) = contours.sort_contours(cnts)
pixelsPerMetric = None

Lastly, we create a list object called measure to hold the values measured from each contour in the image.

# create list object
measure = []

Let’s visualize what we’ve done so far.

Visualization of the process up to this point.

Looping through the contours in the image

The following for loop analyzes each of the contours individually. The first step is to ignore contours with an area less than 10000 pixels.

# loop over the contours individually
for c in cnts:
	# if the contour is not sufficiently large, ignore it
	if cv2.contourArea(c) < 10000:
		continue

The next codeblock creates a properly rotated bounding box around each contour, and X and Y measurements in pixels.

# compute the rotated bounding box of the contour
orig = image.copy()
box = cv2.minAreaRect(c)
box = cv2.cv.BoxPoints(box) if imutils.is_cv2() else cv2.boxPoints(box)
box = np.array(box, dtype="int")

# order the points in the contour such that they appear
# in top-left, top-right, bottom-right, and bottom-left
# order.
box = perspective.order_points(box)

# unpack the ordered bounding box, then compute the midpoint
# between the top-left and top-right coordinates, followed by
# the midpoint between bottom-left and bottom-right coordinates
(tl, tr, br, bl) = box
(tltrX, tltrY) = midpoint(tl, tr)
(blbrX, blbrY) = midpoint(bl, br)

# compute the midpoint between the top-left and bottom-left points,
# followed by the midpoint between the top-right and bottom-right
(tlblX, tlblY) = midpoint(tl, bl)
(trbrX, trbrY) = midpoint(tr, br)

# compute the Euclidean distance between the midpoints
dA = dist.euclidean((tltrX, tltrY), (blbrX, blbrY))
dB = dist.euclidean((tlblX, tlblY), (trbrX, trbrY))

Below, we communicate that if pixelsPerMetric doesn’t yet have a value, then the computer must be reading the first contour. The first contour will be the object farthest to the left of the image. The X dimension of this contour in pixels, divided by the --width argument, provides the value of pixelsPerMetric. This is why it is essential to place the reference object at the far left of these images.

# if the pixels per metric has not been initialized, then
# compute it as the ratio of pixels to supplied metric
# (in this case, centimetres)
if pixelsPerMetric is None:
	pixelsPerMetric = dB / args["width"]
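
To make the conversion concrete, here is a tiny stand-alone sketch with purely illustrative pixel values (not measured from the tutorial image):

# illustrative only: hypothetical pixel measurements, not from test_01.jpg
quarter_px = 500.0                       # width of the quarter's bounding box in pixels
pixelsPerMetric = quarter_px / 2.381     # ~210 pixels per centimetre
flake_px = 630.0                         # width of some flake's bounding box in pixels
print(flake_px / pixelsPerMetric)        # ~3.0 cm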

Now we calculate our measurements using our (possibly) newly defined conversion factor of pixelsPerMetric.

# compute the size and surface area of the object
dimA = (dA / pixelsPerMetric)
dimB = (dB / pixelsPerMetric)
SA = ((cv2.contourArea(c) / pixelsPerMetric) / pixelsPerMetric)

And, lastly, the measurements are appended to the python list object measure one by one.

# add measurements to list
measure.append((dimA, dimB, SA))

Visualizing the result

To visualize the result, we first calculate the centre of the contour to anchor our overlaid text. Next, cv2.drawContours draws the contours in red (0, 0, 255), the surface area measurement is displayed in square centimetres with cv2.putText, the image is resized to fit within a bordered window with cv2.resize, and the image is shown with cv2.imshow.

# compute centre of the contour
M = cv2.moments(c)
cX = int(M["m10"] / M["m00"])
cY = int(M["m01"] / M["m00"])

# draw contours in red
cv2.drawContours(orig, [c.astype("int")], -1, (0, 0, 255), 2)

# draw the object area on the image
cv2.putText(orig, "{:.2f}sqcm".format(SA),
	(int (cX), int (cY)), cv2.FONT_HERSHEY_SIMPLEX,
		0.65, (0, 0, 0), 3)
cv2.putText(orig, "{:.2f}sqcm".format(SA),
	(int (cX), int (cY)), cv2.FONT_HERSHEY_SIMPLEX,
		0.65, (255, 255, 255), 2)

# show the output image
origS = cv2.resize(orig, (1536, 1024))
cv2.imshow("Image", origS)
cv2.waitKey(0)

Surface area of each flake visualized.

Accuracy check

One huge benefit to this method, aside from its obvious non-destructive nature, scalability, and reproducibility, is the opportunity to report the accuracy of your results!

You’ll notice in the above image that the surface area of the quarter is 4.56sqcm. We can use this value to calculate the difference between the expected surface area for a circle with a diameter of 2.381cm and the experimental result. Using the equation A = πr², the expected area is 4.45sqcm, while our experimental result is 4.56sqcm.


We achieved a difference of 0.11cm² between expected and experimental results.


A difference of 0.11cm² equates to a 2.41% error in our measurements using the following formula:


PercentError = ((Experimental - Expected) / Expected) × 100


That 2.41% error can be inversely expressed as 97.59% accuracy!
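
If you’d rather let Python do the arithmetic, a quick sanity check along these lines reproduces the numbers above (4.56sqcm is the quarter’s measured area from the run shown earlier):

import math

expected = math.pi * (2.381 / 2) ** 2   # area of a 2.381cm-diameter circle, ~4.45 sqcm
experimental = 4.56                     # the quarter's measured area from the image
percent_error = (experimental - expected) / expected * 100
print("{:.2f} sqcm expected, {:.2f}% error".format(expected, percent_error))
# 4.45 sqcm expected, 2.41% error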

I think, for not shooting the photo at exactly 90 degrees to the objects and not doing a camera calibration to correct for distortion in the lens, we have achieved fairly exciting results!

Of course, you can always improve the results by correcting for lens distortion and angle of capture.

Saving the data to csv

After the for loop runs its course, the following codeblock exports the information stored in the measure list object to a csv file named Measurements.csv in the subdirectory csv.

# take X, Y, and surface area measurements from the list and append them to a CSV
col_titles = ('X', 'Y', 'SA')
data = np.array(measure).reshape((len(measure), 3))
pd.DataFrame(data, columns=col_titles).to_csv("csv/Measurements.csv", index=False)
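
As a follow-up (not part of the original script), here is a minimal sketch of how you might read Measurements.csv back in and express each artifact’s surface area as a share of the assemblage total, assuming the column names used above:

import pandas as pd

# load the measurements written by artifact-area.py
df = pd.read_csv("csv/Measurements.csv")

# the first row is the quarter (the left-most contour), so drop it before analysis
df = df.iloc[1:]

# each artifact's surface area as a percentage of the total measured area
df["SA_percent"] = df["SA"] / df["SA"].sum() * 100
print(df.sort_values("SA_percent", ascending=False))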

Just like that we are now able to use a quarter and a camera, Python, and OpenCV to non-destructively derive a measure of relative abundance of artifacts using surface area. Not only that, but the ability of the script to measure many more artifacts than we did from a single image is very enticing.

The results are in...

…and we likely all agree that this method provides an effective, non-destructive alternative to the technique proposed by Byrd and Owens for measuring relative abundance in artifact types from surface area.

It’s important to note that this application is just the tip of the iceberg. Applications of Python and computer vision to archaeological workflows are numerous, yet infrequently utilized. However, I hope that as I continue to share information like the above tutorial, we will see a rise in computer literacy within archaeological communities. Strategically investing in computer literacy will inevitably free up time to do more archaeology. Whatever that means to you.

Maybe these newfangled computational methods actually do provide useful tools for archaeologists? I’d love to see archaeologists be able to do more with less funding and time. I believe it is valuable to invest in computer literacy to achieve this.

Next post I’ll be showing a useful script I wrote to help estimate the resolution of ground penetrating radar data in different soil conditions. As usual, if you have any comments don’t hesitate to send them along on the social media post or through email on the about page.

Download the script here!

Wesley Weatherbee

References:

Byrd, J. E., & Owens, D. D. (1997). A Method for Measuring Relative Abundance of Fragmented Archaeological Ceramics. Journal of Field Archaeology, 24(3), 315–320. https://doi.org/10.1179/009346997792208168
Rosebrock, A. (2015, April 6). Zero-parameter, automatic Canny edge detection with Python and OpenCV. PyImageSearch. https://www.pyimagesearch.com/2015/04/06/zero-parameter-automatic-canny-edge-detection-with-python-and-opencv/
Rosebrock, A. (2016, March 28). Measuring size of objects in an image with OpenCV. PyImageSearch. https://www.pyimagesearch.com/2016/03/28/measuring-size-of-objects-in-an-image-with-opencv/
Singh, R. (2019, June 10). Computer Vision—An Introduction. Medium. https://towardsdatascience.com/computer-vision-an-introduction-bbc81743a2f7
Written on February 18, 2020

Creative Commons License
DigitalArchNS by Wesley Weatherbee is licensed under a Creative Commons Attribution 4.0 International License.