Flagging Data to Improve Data Quality

Improve your PhotosynQ projects by setting aside low-quality measurements with the data flagging feature.

 

Whether you are taking measurements in the field or in the lab, it is not uncommon for a handful of measurements to come back with an issue warning. There are several kinds of warnings: red issue warnings suggest that a measurement might be bad, while yellow and blue issue messages simply provide information about the measurement. You may need to remove the measurements with red issue warnings from a project before you can properly analyze your data.

WHY

Bad measurements can happen for many reasons. Maybe someone had shaky hands, the leaf was dead, or the leaf was not completely covering the light guide. Many of these things can be fixed if you follow our Best Measurement Practices. It could be our fault, your fault, or maybe nature’s. Any of these can return an issue warning telling you that something went wrong during your measurement. You always have the choice to discard and redo the measurement in the field, but this can be easy to overlook. If you go through your experiment, accept everything, and later realize you may have made some mistakes, you can flag the data afterwards in the Data Explorer. Flagging data DOES NOT delete the measurements from the website; it only hides them. Flagged data can always be viewed by checking the “Include flagged datasets” box in the Add Series tab, so don’t worry!

Figure: Including flagged data in your data series
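The idea that flagging hides a measurement rather than deleting it can be sketched in a few lines of Python. This is only an illustration of the concept, not how PhotosynQ is implemented: the `Measurement` class, its fields, and the `visible` function are hypothetical names, and the values are made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Measurement:
    """Hypothetical record for illustration; not the PhotosynQ data model."""
    id: int
    phi2: float
    flag_reason: Optional[str] = None  # None means the point is not flagged

    @property
    def flagged(self) -> bool:
        return self.flag_reason is not None


def visible(data: List[Measurement], include_flagged: bool = False) -> List[Measurement]:
    """Return the measurements that a plot or analysis would use."""
    if include_flagged:
        return list(data)
    return [m for m in data if not m.flagged]


# Made-up example values.
data = [
    Measurement(1, 0.62),
    Measurement(2, 0.05),
    Measurement(3, 0.58),
]

# Flagging stores a reason on the measurement; nothing is removed from `data`.
data[1].flag_reason = "Leaf did not cover the light guide"

print(len(visible(data)))                        # flagged point is hidden
print(len(visible(data, include_flagged=True)))  # flagged point is shown again
```

Because the flag is just a reason stored next to the measurement, hiding and showing the point is a matter of filtering, which is why flagged data can always be brought back.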

HOW

Flagging your data could not be any easier in the Data Viewer at photosynq.org. All you have to do is select a single suspect point (1), either from a scatter plot or from the table. Look at the data point in the single data view and see if there were any issue warnings (2). If you determine that the point is no good and you’d like to flag it, look under the name of the person who submitted the data point, where you will see a tab for Issues. Click it, write up your issue in the Reason for Flagging dialog box, and click Submit to flag the data point (3). Remember that when you flag a data point you are hiding it from your analysis, NOT deleting it. Don’t worry, we never delete your data.

(1) This measurement’s SPAD (relative chlorophyll content) is very low. We can take a closer look at it by clicking on the data point.

(2) Here we can see that the macro output an issue warning when the measurement was taken. Even if there is no issue warning, you may spot a problem by examining the measurement trace. In this case it appears that someone failed to completely cover the light guide!

For reference, this is a good, average trace for the Leaf Photosynthesis MultispeQ V1.0 protocol.

(3) Flagging data requires a reason. This helps everyone involved stay on the same page.

The Difference Flagging Makes

When you flag data points, you are essentially putting them aside so that they are not considered when you run an analysis. It is easy to see how your outcomes might differ depending on whether you flagged your data or not.

Figure: Average Phi2 with every measurement included (before flagging)

First, let’s look at data for a project that includes every data point. This is what you might see in your project after a day in the field, before any work has been done on your data. You can see that for the Phi2 parameter the standard error is very large, with error bars even reaching into negative values. Since plants can’t have a negative Phi2, this indicates that some work on the data needs to be done before we can be confident in the results. If you go through and flag the bad measurements, you will end up with a better graph like the one below.

Figure: Average Phi2 after flagging poor-quality measurements

After flagging all of the measurements that either had issue warnings or showed obvious problems in their measurement traces, we get a very different graph. Even after flagging the poor-quality measurements, we still see large standard error bars. Remember, photosynthetic parameters are highly dependent on ambient conditions such as light intensity, time of day, and temperature. We recommend using multivariate analysis to account for these factors. For more tips on data analysis, check out our tutorials here.
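To see why a few bad points can inflate the error bars so much, here is a minimal sketch that computes the mean and standard error of Phi2 with and without flagged measurements. The numbers are invented for illustration and are not taken from the project shown above; `mean_and_sem` is a hypothetical helper, not part of PhotosynQ.

```python
import math
import statistics


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for a list of numbers."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Invented Phi2 values as (value, flagged) pairs, for illustration only.
measurements = [
    (0.61, False),
    (0.58, False),
    (0.64, False),
    (0.60, False),
    (0.02, True),  # e.g. light guide not fully covered
    (0.05, True),  # e.g. dead leaf
]

all_values = [value for value, _ in measurements]
kept_values = [value for value, flagged in measurements if not flagged]

mean_all, sem_all = mean_and_sem(all_values)
mean_kept, sem_kept = mean_and_sem(kept_values)

print(f"All data:         mean={mean_all:.3f}, SEM={sem_all:.3f}")
print(f"Flagged excluded: mean={mean_kept:.3f}, SEM={sem_kept:.3f}")
```

In this made-up example, the two bad points drag the mean down and widen the standard error considerably. Real field data will still vary with ambient conditions after flagging, which is why the remaining spread is better handled with multivariate analysis than with more flagging.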

Issue warnings can occur on exceptional leaves, such as ones with a very dark, deep green colour, and the measurement can still be fine, so it is important to examine each measurement rather than just flagging whole chunks. Flagging your data where appropriate will make your data tighter and more true to what was observed in your experiment. This is useful when interpreting your results because you are not bogged down by faulty measurements, which in turn should improve the statistical analysis of your results!
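If you work with an exported copy of your data, one way to follow this advice is to build a review list instead of flagging in bulk. The sketch below assumes a hypothetical record layout in which each measurement carries a list of (severity, message) warnings; the field names and messages are invented and do not reflect the actual PhotosynQ export format.

```python
# Hypothetical records: an id plus a list of (severity, message) warnings.
measurements = [
    {"id": 101, "warnings": []},
    {"id": 102, "warnings": [("yellow", "example informational message")]},
    {"id": 103, "warnings": [("red", "example issue warning")]},
    {"id": 104, "warnings": [("blue", "example note"), ("red", "example issue warning")]},
]


def needs_review(measurement):
    """True if the measurement has at least one red issue warning."""
    return any(severity == "red" for severity, _ in measurement["warnings"])


# Candidates to inspect by hand (trace, leaf notes), not an automatic flag list.
review_queue = [m["id"] for m in measurements if needs_review(m)]
print(review_queue)
```

The result is only a shortlist of points to open and inspect; whether each one is actually flagged remains a decision made after looking at its trace.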

 
