Py4Bio Chapter 2: List

This is Chapter 2 of Py4Bio.

A list is an ordered collection of objects.

>>> data = [] # Make an empty list

>>> print(data)
[]

# "append" means "add to the end"
>>> data.append("Hello!")
>>> print(data)
['Hello!']

# You can put different objects in the same list
>>> data.append(5)
>>> print(data)
['Hello!', 5]

>>> data.append([9, 8, 7])
>>> print(data)
['Hello!', 5, [9, 8, 7]]

# "extend" appends each element of the new list to the old one
>>> data.extend([4, 5, 6])
>>> print(data)
['Hello!', 5, [9, 8, 7], 4, 5, 6]
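
The difference between append and extend is easiest to see side by side. The following is a small sketch with made-up data (the codons list and the variable names are illustrative assumptions, not part of the chapter):

```python
# Hypothetical example data; these names do not appear in the chapter.
codons = ["ATG", "GCC"]

appended = list(codons)          # copy, so the original list stays unchanged
appended.append(["TAA", "TGA"])  # adds ONE element: the whole list

extended = list(codons)
extended.extend(["TAA", "TGA"])  # adds each element of the argument separately

print(appended)       # ['ATG', 'GCC', ['TAA', 'TGA']]
print(len(appended))  # 3
print(extended)       # ['ATG', 'GCC', 'TAA', 'TGA']
print(len(extended))  # 4
```

Use append to add a single object (even if that object is itself a list) and extend to merge the contents of another list.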

Lists and strings use the same syntax for indexing and slicing, but they differ in one important way: strings are immutable, while lists are mutable (they can be changed in place).

# Strings are immutable.
>>> s = "ATCG"
>>> print(s)
ATCG
>>> s[1] = "U"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
>>> s.reverse()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'reverse'
>>> print(s[::-1])
GCTA
>>> print(s)
ATCG

Lists can be changed in place.

>>> L = ["adenine", "thymine", "cytosine", "guanine"]
>>> print(L)
['adenine', 'thymine', 'cytosine', 'guanine']
>>> L[1] = "uracil"
>>> print(L)
['adenine', 'uracil', 'cytosine', 'guanine']
>>> L.reverse()
>>> print(L)
['guanine', 'cytosine', 'uracil', 'adenine']
>>> del L[0]
>>> print(L)
['cytosine', 'uracil', 'adenine']
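
Because a string cannot be changed in place, one common workaround is to convert it to a list, change the list, and build a new string from the result. The sketch below assumes that approach; it uses join, which is described at the end of this chapter, and the variable names are illustrative only.

```python
dna = "ATCG"

bases = list(dna)      # list() turns the string into a list of characters
bases[1] = "U"         # allowed, because lists are mutable
rna = "".join(bases)   # build a NEW string from the modified list

print(bases)  # ['A', 'U', 'C', 'G']
print(rna)    # AUCG
print(dna)    # ATCG  (the original string is unchanged)
```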

Lists can hold any object.

>>> L = ["", 1, "two", 3.0, ["quatro", "fem", [6j], []]]
>>> len(L)
5
>>> print(L[-1])
['quatro', 'fem', [6j], []]
>>> len(L[-1])
4
>>> print(L[-1][-1])
[]
>>> len(L[-1][-1])
0

A few more list operations: insert and slice assignment.

>>> L = ["thymine", "cytosine", "guanine"]
>>> L.insert(0, "adenine")

>>> print(L)
['adenine', 'thymine', 'cytosine', 'guanine']

>>> L.insert(2, "uracil")

>>> print(L)
['adenine', 'thymine', 'uracil', 'cytosine', 'guanine']
>>> print(L[:2])
['adenine', 'thymine']
>>> L[:2] = ["A", "T"]
>>> print(L)
['A', 'T', 'uracil', 'cytosine', 'guanine']
>>> L[:2] = []
>>> print(L)
['uracil', 'cytosine', 'guanine']
>>> L[:] = ["A", "T", "C", "G"]
>>> print(L)
['A', 'T', 'C', 'G']
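
Assigning to the slice L[:] replaces the contents of the existing list, which is not the same as assigning a new list to the name L. A minimal sketch of the difference, using hypothetical names (bases, alias) that are not from the chapter:

```python
bases = ["A", "T", "C", "G"]
alias = bases                     # a second name for the SAME list object

bases[:] = ["A", "U", "C", "G"]   # in place: the shared list's contents change
print(alias)                      # ['A', 'U', 'C', 'G']

bases = ["G", "C"]                # rebinding: bases now names a different list
print(alias)                      # ['A', 'U', 'C', 'G']  (unaffected)
print(bases)                      # ['G', 'C']
```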

Turn a string into a list

Complicated way:

>>> s = "AAL532906 aaaatagtcaaatatatcccaattcagtatgcgctgagta"

>>> i = s.find(" ")
>>> print(i)
9

>>> print(s[:i])
AAL532906
>>> print(s[i+1:])
aaaatagtcaaatatatcccaattcagtatgcgctgagta

Easier way:

>>> s = "AAL532906 aaaatagtcaaatatatcccaattcagtatgcgctgagta"

>>> fields = s.split()
>>> print(fields)
['AAL532906', 'aaaatagtcaaatatatcccaattcagtatgcgctgagta']
>>> print(fields[0])
AAL532906
>>> print(fields[1])
aaaatagtcaaatatatcccaattcagtatgcgctgagta
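
The identifier/sequence separation shown above can be wrapped in a small function. This is only a sketch: parse_record is a hypothetical helper name, and it assumes each line holds an identifier, whitespace, and then the sequence. It uses split with a maximum of one split so that the line is cut only at the first run of whitespace.

```python
def parse_record(line):
    """Split an 'ID sequence' line into (identifier, sequence).

    Hypothetical helper; assumes the identifier contains no whitespace.
    """
    # split(None, 1): split on whitespace, at most once
    identifier, sequence = line.split(None, 1)
    # any trailing whitespace stays on the last piece, so strip it
    return identifier, sequence.strip()


record = "AAL532906 aaaatagtcaaatatatcccaattcagtatgcgctgagta"
name, seq = parse_record(record)
print(name)      # AAL532906
print(len(seq))  # 40
```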

More split examples.

With no arguments, <string>.split() splits on runs of whitespace (spaces, tabs, newlines) and ignores leading and trailing whitespace.

>>> protein = "ALA PRO ILE CYS"

>>> residues = protein.split()
>>> print(residues)
['ALA', 'PRO', 'ILE', 'CYS']

>>> protein = " ALA     PRO    ILE CYS  \n"

>>> print(protein.split())
['ALA', 'PRO', 'ILE', 'CYS']

You can also pass a separator string to split, and the string is then cut at each occurrence of that separator.

>>> print("HIS-GLU-PHE-ASP".split("-"))
['HIS', 'GLU', 'PHE', 'ASP']
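
Splitting on an explicit separator behaves differently from splitting on whitespace when separators are repeated: every occurrence counts, so adjacent separators produce empty strings. A short sketch with made-up input strings:

```python
peptide = "HIS--GLU-PHE"          # note the doubled hyphen
print(peptide.split("-"))         # ['HIS', '', 'GLU', 'PHE']

spaced = "HIS  GLU PHE"           # note the doubled space
print(spaced.split())             # ['HIS', 'GLU', 'PHE']
print(spaced.split(" "))          # ['HIS', '', 'GLU', 'PHE']
```

For whitespace-separated data, calling split() with no argument is therefore usually the safer choice.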

Turn a list into a string

<string>.join(<list>) is the opposite of split: it concatenates the strings in the list, placing <string> between each pair.

>>> L1 = ["Asp", "Gly", "Gln", "Pro", "Val"]
>>> print("-".join(L1))
Asp-Gly-Gln-Pro-Val

>>> print("**".join(L1))
Asp**Gly**Gln**Pro**Val

>>> print("\n".join(L1))
Asp
Gly
Gln
Pro
Val
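
Two points about join are worth checking in code: splitting and joining with the same separator gives back the original string, and join only accepts strings, so other objects must be converted first. A brief sketch (the positions list is invented for illustration):

```python
original = "HIS-GLU-PHE-ASP"
residues = original.split("-")
rejoined = "-".join(residues)
print(rejoined == original)       # True

# join requires strings; convert numbers with str() first
positions = [12, 47, 103]         # hypothetical residue positions
print(",".join(str(p) for p in positions))   # 12,47,103
```

Calling ",".join(positions) directly would raise a TypeError, because the list contains integers rather than strings.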