From the Field to the Laboratory: Using DNA to Show How Different Populations Are Related

As part of an investigation into the relationships of the crag lizards (genus Pseudocordylus), Dr Michael Bates and Mr Edgar Mohapi of the Department of Animal and Plant Systematics conducted several collecting expeditions in South Africa. Tissue samples from the specimens were used in genetic analyses to gain insight into the evolutionary relationships between the different populations. Several years later I was asked to assist with this project by expanding on the genetic analysis.

The genus Pseudocordylus consists of five or six species, though the relationships between some of these are tenuous and there has been debate and uncertainty about which taxa qualify as subspecies or full species. Dr Bates had already performed an initial genetic analysis on the genus and the results looked interesting, but clearly more work was needed to resolve the many complex relationships. After reviewing what had already been obtained, we set about procuring tissue samples for each of the species in the genus, primarily sourced from the National Museum’s collection. In June 2019 I travelled to the University of Pretoria to start working on this conundrum of relatedness.

By sequencing genes from lizards of each species and running these sequences through genetic analyses, we would be able to visualise how the different lizards are related and get an idea of which groups are valid species and which are not. We suspected that, despite the recent studies focussing on the genus, the situation might not be quite as straightforward as it seemed.

The actual laboratory work involves a fairly complex series of procedures, and I had just three weeks to work on the samples from 100 lizards I had brought with me. Considering that this scale of work generally forms the bulk of a postgraduate research project, spread out over an entire year, I was quite anxious about being able to complete everything! Over the course of only three weeks I would need to complete the processes of DNA extraction, amplifying particular target genes with this extracted DNA, and preparing the resulting gene products to be sequenced at the University of Pretoria’s DNA Sanger Sequencing Facility.

DNA is present in all animals, plants and bacteria, and can be extracted from a wide variety of materials. In TV crime and drama series we often see CSI-like units extracting DNA from a drop of blood, some saliva, or less probable sources like stray hairs or even fingerprints. Muscle tissue is often used for research on wild animals, although blood, fin clips, hair samples, or even dung can be used. For our lizard samples we mostly used thigh muscle, liver or tail tips (also muscle). A small section of whatever tissue was available (less than 25 mg, or a 2 mm cube) was dissected with a scalpel, taking care to keep the tools free of contamination by other DNA. From this point on there were a few different options for getting high-quality DNA.

One of the more streamlined methods, and the one we chose to use, is a commercial DNA extraction kit. Small pieces of tissue were ‘lysed’ (that is, had their cell membranes broken down by the application of enzymes or detergents), and the samples were then kept at a high temperature overnight. Buffers, which are solutions of chemicals with particular ions and a specific, stable pH, were then added and each sample was centrifuged in a spin column at very high force (created by spinning at 8000–14000 revolutions per minute). Sodium ions from the buffer neutralised the negative charge of DNA, and this, combined with the addition of ethanol, precipitated DNA out of the solution as it is not alcohol-soluble. DNA was left behind on the membrane of the spin column, while other tissue components passed through the membrane into the tube below and were discarded (Figure 1). A simple follow-up wash and centrifuge with water then released the purified DNA from the membrane.
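For readers curious about how spinning speed translates into force: centrifuge force is usually expressed as relative centrifugal force (RCF, in multiples of gravity, "x g"), and the standard conversion is RCF = 1.118 × 10^-5 × radius (cm) × rpm². The short sketch below applies this formula to the speed range mentioned above. The 7 cm rotor radius is purely an assumption for illustration and is not taken from our laboratory work; the real value depends on the centrifuge used.

```python
def rcf_from_rpm(rpm, rotor_radius_cm):
    """Relative centrifugal force (multiples of g) for a given speed and rotor radius."""
    return 1.118e-5 * rotor_radius_cm * rpm ** 2


# Assumed rotor radius (hypothetical); check the manual of your own centrifuge.
ASSUMED_RADIUS_CM = 7.0

for rpm in (8000, 14000):
    force = rcf_from_rpm(rpm, ASSUMED_RADIUS_CM)
    print(f"{rpm} rpm is roughly {force:.0f} x g")
```

Because the force grows with the square of the speed, going from 8000 to 14000 rpm roughly triples the force rather than merely doubling it.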

Figure 1: Diagram demonstrating how DNA is isolated from the buffer and other cellular debris. DNA in the solution is neutralised with sodium ions in the buffer and precipitated out of the solution when ethanol is added and the product is centrifuged at high force. The rest of the solution and cellular debris passes through the membrane (or is stuck above it) while DNA is initially bound to the membrane and then safely removed by centrifuging with water.

The purified DNA samples could now be used to amplify particular genes. We generally need to know the sequence at the beginning and end of a target gene, usually in a highly conserved region, to ensure that this part of the gene matches between different individuals or species. We combine all of the reagents required to produce new DNA through replication in a single tube, add a small amount of the purified DNA from one of our samples, and then add primers (short sequences of 20–30 bases matching the start and end of our target gene). We then put our samples into a thermocycler, a machine which very accurately raises and lowers temperatures over particular time intervals to stimulate the purified DNA into replicating itself using the reagents we have provided. If the primer sequences are not specific enough, the wrong gene may be amplified. And if any other DNA is unintentionally introduced, such as that of the researcher, one could end up amplifying a gene from a different species by mistake!
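The amplification step, known as the Polymerase Chain Reaction (PCR), can be mimicked on a computer to show how a pair of primers picks out one stretch of DNA. The sketch below is a toy illustration only: the template and the 8-base primers are invented (real primers are longer, as noted above), and the function name `in_silico_pcr` is hypothetical rather than part of any laboratory software. The forward primer matches the start of the target, while the reverse primer is the reverse complement of its end, because it binds to the opposite DNA strand.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read in its own 5'-3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def in_silico_pcr(template, forward_primer, reverse_primer):
    """Return the stretch of template bounded by the two primers, or None if either is absent."""
    start = template.find(forward_primer)
    if start == -1:
        return None
    # On the template strand, the reverse primer's site appears as its reverse complement.
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Invented toy template: flank + forward site + target interior + reverse site + flank.
template = "AAA" + "CCGTTGAC" + "ATGCATGCATTTGGCCA" + "AGTCGATC" + "CC"
product = in_silico_pcr(template, "CCGTTGAC", "GATCGACT")
print(product, len(product) if product else 0)
```

If either primer also matched somewhere else in the template, this simple search would report the first match it found, which is the computer's version of amplifying the wrong gene.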

Dr Stobie working with a box of extracted DNA samples (orange) to prepare amplified DNA products for a mitochondrial gene (purple tray). (Photo: Zoe Yannikarkis)

Once the thermocycling process has been completed and we think that we have obtained the correct gene product, we need to confirm this visually. This generally involves testing a small amount of our final solution on a gel made of agarose (a seaweed extract) using a process known as electrophoresis, where an electrical current is passed through the gel as it lies submerged in a liquid buffer. DNA fragments travel with the current through the porous gel, with smaller fragments moving more easily and so travelling further in a given period of time. DNA products are not usually visible, so prior to starting the electrophoresis we dye them with a chemical that fluoresces under UV light. A current is then applied between electrodes, and after some time the gel is removed and viewed under UV light, where the dyed DNA shows up as a bright band. The gel is generally run alongside a series of DNA fragments of known lengths, which lets us estimate the size of the product obtained from the DNA amplification. We can then compare this to the length we were expecting for any given gene to determine whether it is likely that we have successfully amplified our target gene fragment. If we were unsuccessful we might not see any band at all, or there may be bands of much larger or smaller sizes than expected where other genes or species were amplified instead.
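Estimating the size of a band from the fragments of known length can be sketched in a few lines. A common rule of thumb is that the distance travelled is roughly linear in the logarithm of fragment size, so the sketch below interpolates between the two known bands on either side of the unknown one. The ladder sizes and distances are invented for illustration and do not come from our gels, and `estimate_size_bp` is a hypothetical helper name.

```python
import math


def estimate_size_bp(distance_mm, ladder):
    """Estimate fragment size by log-linear interpolation between neighbouring ladder bands.

    ladder is a list of (size_in_base_pairs, distance_travelled_mm) pairs
    with distinct distances.
    """
    bands = sorted(ladder, key=lambda band: band[1])  # shortest distance first
    for (size_a, dist_a), (size_b, dist_b) in zip(bands, bands[1:]):
        if dist_a <= distance_mm <= dist_b:
            fraction = (distance_mm - dist_a) / (dist_b - dist_a)
            log_size = math.log10(size_a) + fraction * (math.log10(size_b) - math.log10(size_a))
            return 10 ** log_size
    raise ValueError("band lies outside the range covered by the ladder")


# Invented ladder: larger fragments travel shorter distances.
toy_ladder = [(1000, 20.0), (500, 30.0), (250, 40.0), (100, 53.0)]
print(round(estimate_size_bp(25.0, toy_ladder)))
```

An estimate like this is only approximate, which is why a band of about the right size is treated as a sign of likely success rather than proof; the sequencing that follows is the real confirmation.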

Prepared samples are loaded into a thermocycler to undergo a process known as Polymerase Chain Reaction (PCR) where DNA is stimulated to replicate itself using particular temperature steps and chemicals. Three thermocyclers are present here with the one on the right showing a digital display of the different temperature steps involved. (Photo: Zoe Yannikarkis)

The results of an agarose gel electrophoresis. Each column contains a single amplified gene product visualised under UV light and indicated by a single dark band. There is some variance in the size of these DNA fragments, shown by their relative placement on the gel. This size difference may indicate that different genes have been obtained, though in some cases, as above, this reflects the natural diversity that exists between different species. A large number of mutations between species or genera typically results in some degree of size difference in fragments, although the overall size range here is relatively small.

Visual confirmation of gene amplification is generally followed by preparation for gene sequencing itself. Several different sequencing technologies are available, each with its own set of pros, cons, requirements and limitations. We were fortunate to be allowed to use the University of Pretoria’s DNA Sanger Sequencing Facility for our lizard samples, so we just needed to clean up our DNA products in preparation for the protocol that would be performed on one of the facility’s sequencing machines. This cleaning involved three steps: precipitating the gene product to again remove impurities, visualising the product again by agarose gel electrophoresis as above, and cycle sequencing. In cycle sequencing we again stimulated the DNA into copying itself, but this time supplied it with specially marked components so that the sequencing machine could later read off these markings one by one and identify the order in which the bases of the DNA are arranged. This allowed us to identify the exact sequence for each lizard sample.

An example of a small section of aligned mitochondrial sequences for the gene 16S. DNA sequences are made up of four different bases – Adenine, Cytosine, Guanine and Thymine – in different combinations. Each horizontal row here represents a different lizard. We can see in the final column that at this particular site two lizards have Cytosine instead of Thymine. This is not surprising as these two lizards belong to a different genus. All the others belong to the genus Pseudocordylus.
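Spotting sites where aligned sequences differ, as in the final column described above, is easy to automate. The following is a minimal sketch using an invented five-sequence alignment; the sample names and bases are made up for illustration and are not our lizard data.

```python
def variable_sites(alignment):
    """Return (position, {name: base}) for every column where the sequences differ.

    alignment maps a sample name to its aligned sequence; positions count from 1.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    sites = []
    for pos in range(length):
        column = {name: alignment[name][pos] for name in names}
        if len(set(column.values())) > 1:
            sites.append((pos + 1, column))
    return sites


# Invented toy alignment: three ingroup samples and two from another genus.
toy_alignment = {
    "ingroup_1": "ACGTTAGCAT",
    "ingroup_2": "ACGTTAGCAT",
    "ingroup_3": "ACGTTAGCAT",
    "outgroup_1": "ACGTTAGCAC",
    "outgroup_2": "ACGTTAGCAC",
}
for position, column in variable_sites(toy_alignment):
    print(position, column)
```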

By the end of the three-week period I had prepared all of the samples that would amplify for sequencing. But the work continued once back at the National Museum – the sequence data from the sequencing facility still needed to be checked, edited and analysed. Genetic analyses were performed to determine the relatedness between samples. One of the most important of these analyses is the production of a phylogenetic ‘tree’ diagram of genetic relatedness, where different groups of lizards are portrayed as leaves at the ends of branches of a tree of relatedness.
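To give a flavour of what "relatedness" means numerically, one of the simplest measures is the proportion of aligned sites at which two sequences differ (often called the p-distance). Tree-building programs use far more sophisticated models, and the sketch below is not the method used in our study; the sequences and sample names are invented purely for illustration.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of compared sites that differ, ignoring alignment gaps ('-')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not compared:
        raise ValueError("no sites to compare")
    return sum(a != b for a, b in compared) / len(compared)


# Invented toy sequences for three hypothetical samples.
toy_sequences = {
    "north_1": "ACGTACGTAC",
    "north_2": "ACGTACGTAT",
    "south_1": "ACGAACCTAC",
}
for name_a, name_b in combinations(toy_sequences, 2):
    distance = p_distance(toy_sequences[name_a], toy_sequences[name_b])
    print(f"{name_a} vs {name_b}: {distance:.2f}")
```

In this toy example the two "north" samples differ least from one another, so a tree built from these distances would place them on neighbouring branches.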

A sample phylogenetic tree based on the mitochondrial gene cytochrome B showing the relatedness of different lineages of KwaZulu-Natal Yellowfish (Labeobarbus natalensis) from various drainage systems (shown in the key). Fish from rivers in the same drainage system share some degree of genetic similarity and cluster together on the tree. There is also a broad split between the southern (Mkomaas, Mbokodweni and Umzimkhulu) and northern (Umgeni, Tugela, Umfolozi) lineages. Support values are not shown but can be important for interpretations of relatedness. The sequence data used to generate this tree are freely available on GenBank (https://www.ncbi.nlm.nih.gov/genbank/) as sequences MH936322-MH936344. Please refer to the original research paper by Stobie et al. (2019) for more information regarding how these sequences were obtained.

The provisional tree that was generated in our main study on Pseudocordylus crag lizards indicated several interesting relationships between different populations and even suggested the possibility of one or more new species! One of the next steps is to investigate whether the populations identified as unique by the genetic analysis differ also in terms of their morphology. If they do, this will be further support for describing them as new species.
