PEBL: Signal Detection Theory
What Is Signal Detection Theory?
Signal detection theory describes how observers detect, process, and compare information in the presence of noise, and how decision criteria shape the responses they make based on that information. In this project I explore how well I can detect small changes in pictographic density and what factors might affect my ability to do so. Through testing and data analysis, I hope to learn more about how signal detection methods apply to real-world situations.
The Experiment
The Testing Interface: PEBL
The PEBL (Psychology Experiment Building Language) software package allows users to design and create psychological experiments. For this project, I ran a set of experiments on myself that were pre-coded and provided to me by my ENP 0163 professor, Dr. Intriligator.
The Test
The experiment is simple: a set of 80 rectangular images is presented on the screen, and for each one the user must choose whether there are more stars than dashes or fewer stars than dashes. User input is simplified to two keys (left shift and right shift). I self-administered the test on my laptop, completing 6 consecutive runs. One practice run was done first to familiarize myself with the process before any data was collected.
The Test Parameters
A "low" (A group) density image has an average of 46 stars (standard deviation of 5) and a "high" (B group) density image has an average of 54 stars (standard deviation of 5). To start, the images were distributed 50:50, meaning half were high density and half were low density. These are the default settings. To gain a better understanding of how my performance reacted to changes in test parameters, some of these parameters were varied during testing. More on that below.
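To make the stimulus design concrete, here is a minimal sketch of how such star/dash images could be generated. This is my own illustration, not PEBL's actual code; the function name and the total symbol count of 100 are assumptions.

```python
import random

def make_stimulus(mean_stars, sd=5, total_symbols=100, rng=random):
    """Return a shuffled list of '*' and '-' symbols.

    The star count is drawn from a normal distribution and clamped
    to the valid range; the remaining symbols are dashes.
    """
    n_stars = round(rng.gauss(mean_stars, sd))
    n_stars = max(0, min(total_symbols, n_stars))
    symbols = ["*"] * n_stars + ["-"] * (total_symbols - n_stars)
    rng.shuffle(symbols)
    return symbols

low_image = make_stimulus(46)   # "A" (low-density) trial
high_image = make_stimulus(54)  # "B" (high-density) trial
```

Because both distributions share a standard deviation of 5 and overlap heavily, a single image cannot be classified with certainty, which is exactly what makes this a signal detection task.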
External Parameters
To add an extra dimension to the data set, I chose to do several extra runs that included external factors in the hopes of changing my results. Using my music library, I did additional tests while listening to different types of music: music with no lyrics, music with lyrics I knew, and no music. To add an extra layer of challenge, for the music with lyrics, I chose a high energy, fast paced song. The music was started at the beginning of a test and stopped at the end.
The Data
In total, I collected 6 data sets:
Run 1
Low/High Target: 46/50 (5 stdDev)
Signal Distribution (B:A): 40:40
User listening to instrumental music
Run 2
Low/High Target: 46/50 (5 stdDev)
Signal Distribution (B:A): 40:40
User listening to music with lyrics they know
Run 3
Low/High Target: 46/50 (5 stdDev)
Signal Distribution (B:A): 40:40
User not listening to music (quiet room)
Run 4
Low/High Target: 46/50 (5 stdDev)
Signal Distribution (B:A): 20:60
User not listening to music (quiet room)
Run 5
Low/High Target: 46/50 (5 stdDev)
Signal Distribution (B:A): 60:20
User not listening to music (quiet room)
Run 6
Low/High Target: 38/50 (5 stdDev)
Signal Distribution (B:A): 40:40
User not listening to music (quiet room)
The output data from these experiments was collected as a .csv from the PEBL software and imported into Excel for analysis. Once the data was in Excel, I was able to calculate Z scores for hit rate, false positive rate, correct reject rate, d', and beta. This allowed me to use the resource found here to help visualize what each ROC (receiver operating characteristic) curve looks like. Additionally, in Excel, I plotted each ROC on one graph in an effort to analyze the curves with respect to one another.
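The spreadsheet calculations described above can be sketched in a few lines of Python. This is a generic signal detection computation, not my actual spreadsheet; the trial counts in the usage line are made-up illustration values, not my data.

```python
import math
from statistics import NormalDist

def sdt_metrics(hits, misses, false_alarms, correct_rejects):
    """Compute hit rate, false-alarm rate, d', and beta from raw counts.

    Note: rates of exactly 0 or 1 give infinite z-scores and would
    need a standard correction (e.g., log-linear) before use.
    """
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejects)
    z = NormalDist().inv_cdf  # z-score (inverse standard normal CDF)
    d_prime = z(hit_rate) - z(fa_rate)
    c = -0.5 * (z(hit_rate) + z(fa_rate))  # criterion location
    beta = math.exp(d_prime * c)           # likelihood ratio at criterion
    return hit_rate, fa_rate, d_prime, beta

# Illustration counts only (40 high-density and 40 low-density trials):
hit_rate, fa_rate, d_prime, beta = sdt_metrics(30, 10, 8, 32)
```

The identity beta = exp(d' * c) is equivalent to the ratio of the normal densities at the hit-rate and false-alarm z-scores, which is the more common textbook form.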
Discussion
There are several points of interest in my data. The first thing I noticed in my analysis was that reducing the density of the "low" images in run 6 drastically increased my ability to tell the images apart. With a sensitivity (d') of 3.4, it completely blows my other results away (the next closest d' is 1.05).
The next thing I noticed is that music appeared to help me score better on the test. Each run with music playing, with or without lyrics, appeared to boost my score and my ability to correctly identify pictographic density. Of course, a handful of trials may not be enough to establish this statistically, but it is interesting!
While music and density reduction on the low side improved my scores, I could not find a strong correlation with the change in base rate. At first it appeared that switching from 40:40 to 20:60 drastically reduced my score, but the next test at 60:20 brought my score back up to essentially match the 40:40 score. This could be due to a few reasons: either the 20:60 run was an outlier due to human error and variability, or I am simply better at identifying high-density images than low-density ones. In the future, more tests could be done to isolate this factor and settle it one way or the other.
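Signal detection theory does predict how an observer who knows the base rates should respond: with equal payoffs, the optimal likelihood-ratio criterion is beta = P(A)/P(B). A quick sketch for the three distributions I used (treating "high"/B as the signal class; this framing is my own, not part of the PEBL output):

```python
def optimal_beta(n_a, n_b):
    """Optimal likelihood-ratio criterion for equal payoffs.

    n_a: number of "low" (A) trials, n_b: number of "high" (B) trials.
    beta > 1 means be conservative about responding "high".
    """
    return n_a / n_b

beta_even = optimal_beta(40, 40)  # runs 1-3: neutral criterion (1.0)
beta_rare = optimal_beta(60, 20)  # run 4 (B:A = 20:60): beta = 3.0
beta_common = optimal_beta(20, 60)  # run 5 (B:A = 60:20): beta = 1/3
```

Comparing my measured beta values against these optima would show whether I actually shifted my criterion with the base rates, which might help explain the asymmetric results in runs 4 and 5.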
Finally, I believe my overall ability to discern high from low density is quite poor. With my lowest score being a 37% hit rate, I should not put pictographic density identification in the skills section of my resume. That said, I was able to show that I am capable of improving when the experiment design changes. I wonder whether there are other ways to train a user to improve on a specific test without making the test easier. I'd love to try different types of music to see if there is a category or artist that improves my results even more!
Signal Detection in Systems Design
In completing this experiment, I began to think about how these concepts could be applied to a system that's already in place. The first place that came to mind is the possibility of implementing signal detection theory on a manufacturing line. Oftentimes, complex assemblies with non-deterministic outcomes (things that are hard to measure: heat shrink final dimensions, flexible assembly dimensions, glue application evenness, etc.) will list "visual inspection" as the way of confirming or denying that a part is suitable for use. This visual inspection is done by a trained line operator whose job it is to look at each part that comes off the line.
What if, instead of relying on a human operator alone, a machine were introduced as a backup to help improve the operator's hit/miss statistics? A camera system could be added to the manufacturing line that sees each part and makes judgments based on an AI model. As parts are accepted or rejected by the human operator, the AI could learn what criteria distinguish a good part from a bad one. As the AI improves, it could start giving second opinions to help the operator lower their false alarm rate. Eventually, perhaps the human operator could be removed entirely? This approach could be added at many points along the manufacturing line where applicable. It would also open the door to better line-wide statistics, showing line managers where systematic problems may lie and how they might improve production from a systemic point of view.