Modules:
compute_initialpqtree module
Copyright © Bonsai - LIFL (Université Lille 1, CNRS UMR 8022) and Inria-Lille Nord Europe
contact: aida.ouangraoua@inria.fr, amandine.perrin@inria.fr
This software is a computer program whose purpose is to progressively reconstruct ancestral
gene orders.
This software is governed by the CeCILL-B license under French law and
abiding by the rules of distribution of free software. You can use,
modify and/or redistribute the software under the terms of the CeCILL-B
license as circulated by CEA, CNRS and Inria at the following URL
http://www.cecill.info, or in the LICENCE file at the root directory of this program.
The fact that you are presently reading this means that you have had
knowledge of the CeCILL-B license and that you accept its terms.
compute_initialpqtree module description:
This module reads the source files (tree file and blocks file), stores information,
and fills a CAR file with an initial set of CARs, each composed of a single block.
It needs, as input :
- a species tree where a node is marked as the ancestor: @
- a set of universal synteny blocks
- a CAR file name (in which initial PQtree will be saved)
Module author: Aïda Ouangraoua, Amandine PERRIN
February 2014
-
procars.step_modules.compute_initialpqtree.commons_read_blocks(block_file_name, nb_species, spe_ids, binary_file_name)[source]
Function reading blocks from a block file and computing block neighbors. Saves all
information about blocks into a binary file, used in the next steps
Parameters: | block_file_name : String
Name of file containing the ordered blocks for all species
nb_species : int
spe_ids : dict
Species and their corresponding ID {spe1: id1, spe2: id2, ...}
binary_file_name : String
Name of the file in which saving all information about the blocks
|
Returns: | int
|
-
procars.step_modules.compute_initialpqtree.commons_read_tree(tree_file, binary_file_name)[source]
Reading, and computing parameters and partitions from a tree file, and saving values computed
in a binary file.
Parameters: | tree_file_name : String
Name of file containing species tree
binary_file_name : String
Name of file where results must be stored
|
Returns: | list
nb_species: int, total number of species
spe_ids: dict, {species1: id1, species2: id2, ...}
|
-
procars.step_modules.compute_initialpqtree.compute_blocks_neighbors(nb_blocks, nb_species, blocks)[source]
Function computing block neighbors
Parameters: | nb_blocks : int
nb_species : int
blocks : list
List of lists
blocks = [[list of blocks for spe_id=0], [list of blocks for spe_id=1], ...]
with list of blocks for spe_id=1 =
[[list of blocks for spe_id=1, chromosome 1],
[list of blocks for spe_id=1, chromosome 2],
...]
with list of blocks for spe_id=1, chromosome 1 =
[[start, end, num_bloc, orient],
[start, end, num_bloc, orient],
... for the n blocks]
|
Returns: | tuple
left : array such that left[spe_id][block_id] is an array containing the left
neighbor of block block_id in spe_id
right : array such that right[spe_id][block_id] is an array containing the right
neighbor of block_id in spe_id
|
-
procars.step_modules.compute_initialpqtree.compute_parameters_from_tree(tree_file_name)[source]
Function computing parameters from a tree file
Parameters: | tree_file_name : String
Name of file containing the species tree to read to extract parameters
|
Returns: | tuple
tree: dict, tree structure
ancestor_id: int, id of ancestor node in tree
path: list, array of 0 and 1 indicating nodes that are ancestor of ancestor node
leaves: list, array of integers containing a list of leave ids
internals: list, integer array containing an ordered list of internal nodes ids in a
postfix order with ancestor node as root
|
-
procars.step_modules.compute_initialpqtree.compute_partition_and_species_id(tree, ancestor_id)[source]
Function calling the method partitionning the tree into 3 groups defined by the
ancestor, and determining the 3 groups of species IDs, as well as the list of species with
their corresponding ID.
Parameters: | tree : dict
Tree structure :
{node_id: [node_name, parent_id, left_child_id, right_child_id], ...}
Where id = -1 if no parent/left_child/right_child
And node_name = “N” + node_id for internal nodes, or leaf_name for leaf nodes
ancestor_id : int
ID of ancestor node in tree
|
Returns: | tuple
species1_id: integer array, ids of group1 species
species2_id: integer array, ids of group2 species
species3_id: integer array, ids of group3 species
spe_ids: map such that spe_ids[spe_name] = spe_id
|
-
procars.step_modules.compute_initialpqtree.list_leaves_internals(tree, root_id, path)[source]
Function computing a list of leaves and a postfix ordered list of internal nodes for a tree
Recursive algorithm visiting nodes from ancestor node to leaves, in a prefix order.
Parameters: | tree : dict
Tree structure :
{node_id: [node_name, parent_id, left_child_id, right_child_id], ...}
where id = -1 if no parent/left_child/right_child
and node_name = “N” + node_id for internal nodes, or leaf_name for leaf nodes
root_id : int
The node considered as the new root of the tree
path : list
0-1 array such that path[node_id] = 1 if node is on the path between ancestor node and root,
and otherwise path[node_id]=0
|
Returns: | list
leaves_internals: list containing two lists: a list of leaves ids and an ordered list of
internal node ids
|
-
procars.step_modules.compute_initialpqtree.main(tree_file_name, block_file_name, car_file_name, tree_bin, blocks_bin, adj_file)[source]
Main method.
Reads tree file and block file, stores information into binary files, and creates a
CAR file with an initial set of CARs, each composed of a single block. Also creates an
empty adjacency file (no adjacency added at step 0).
Parameters: | tree_file_name : string
Name of the species tree to read
block_file_name : string
Name of the blocks file to read
car_file_name : string
Name of PQtree file in which saving the initial set of CARs
tree_bin : string
Name of binary file in which saving all information found in tree file
blocks_bin : string
Name of binary file in which saving all information found in blocks file
adj_file : string
Name of adjacency file in which saving adjacencies added at step0
(= no adjacency -> empty file)
|
Returns: | int
|
-
procars.step_modules.compute_initialpqtree.mark_ancestors(tree, ancestor_id)[source]
Method marking tree nodes which are on the path between the ancestor node and the root node
Parameters: | tree : dict
Tree structure :
{node_id: [node_name, parent_id, left_child_id, right_child_id], ...}
where id = -1 if no parent/left_child/right_child
and node_name = “N” + node_id for internal nodes, or leaf_name for leaf nodes
ancestor_id : int
ID of ancestor node in tree
|
Returns: | list
0-1 array such that path[node_id] = 1 if node is on the path between ancestor node and root,
and otherwise path[node_id] = 0
|
-
procars.step_modules.compute_initialpqtree.partition_species(species1, species2, species3, tree, ancestor_id, cur_id, partition_id)[source]
Function computing the partition of species defined by the ancestor node
Recursive algorithm
Parameters: | species1 : list
Species names in group 1 array to be filled
species2 : list
Species names in group 2 array to be filled
species3 : list
Species names in group 3 array to be filled
tree : dict
Tree structure :
{node_id: [node_name, parent_id, left_child_id, right_child_id], ...}
Where id = -1 if no parent/left_child/right_child
And node_name = “N” + node_id for internal nodes, or leaf_name for leaf nodes
ancestor_id : int
ID of ancestor node in tree
cur_id : int
ID of current node during the tree transversal
partition_id : int
ID of current group in partition
|
Returns: | list
species1: String array, names of group1 species
species2: String array, names of group2 species
species3: String array, names of group3 species
|
-
procars.step_modules.compute_initialpqtree.write_init_cars_file(car_file_name, nb_blocks)[source]
Creates a CAR file, with an initial set of CARs, each composed of a single block
Parameters: | car_file_name : String
File in which saving the CARs
nb_blocks : int
Total number of blocks = number of CARs in the output file
|
compute_conserved_adjacencies module
Copyright © Bonsai - LIFL (Université Lille 1, CNRS UMR 8022) and Inria-Lille Nord Europe
contact: aida.ouangraoua@inria.fr, amandine.perrin@inria.fr
This software is a computer program whose purpose is to progressively reconstruct ancestral
gene orders.
This software is governed by the CeCILL-B license under French law and
abiding by the rules of distribution of free software. You can use,
modify and/or redistribute the software under the terms of the CeCILL-B
license as circulated by CEA, CNRS and Inria at the following URL
http://www.cecill.info, or in the LICENCE file at the root directory of this program.
The fact that you are presently reading this means that you have had
knowledge of the CeCILL-B license and that you accept its terms.
compute_conserved_adjacencies module description:
From a CAR file, this module computes a set of conserved adjacencies between cars
for the ancestor node, and it splits this set into :
- a set of fully conserved adjacencies and a set of partly conserved adjacencies first
- among the two sets, a set of non-conflicting adjacencies and a set of conflicting adjacencies
Module author: Aïda Ouangraoua, Amandine PERRIN
February 2014
-
procars.step_modules.compute_conserved_adjacencies.check_potential_cars_neighbors(spe_id, neighbors, segments_neighbors, cars, block_position_in_car, left, right)[source]
For each pair of potential neighboring CARs, check if the pair of segments
really supports the adjacency
Parameters: | spe_id : int
neighbors : list
List of potential neighbors (with CAR ids). Ex: [[], [1, 2], [2, 3], ...]
segments_neighbors : list
List of potential segment neighbors. Ex. [[], [[1], [2]], [[2], [3]], ...]
cars : list
block_position_in_car : list
Integer array such that block_position_in_car[block_id] = position of block
in car to which it belongs
left : list
Left neighbors of species for all blocks
right : list
Left neighbors of species for all blocks
|
Returns: | tuple
left and right valid neighbors
|
-
procars.step_modules.compute_conserved_adjacencies.commons_read_cars(nb_blocks, blocks, nb_species, car_file_name, step_nb, binary_file_name)[source]
Function reading blocks from a block file, reading car file, and computing block neighbors
Parameters: | nb_blocks : int
blocks : list
array of arrays (blocks for each species)
blocks = [[list of blocks for spe_id=0], [list of blocks for spe_id=1], ...]
with list of blocks for spe_id=1 =
[[list of blocks for spe_id=1, chromosome 1],
[list of blocks for spe_id=1, chromosome 2],
...]
with list of blocks for spe_id=1, chromosome 1 =
[[start, end, num_bloc, orient],
[start, end, num_bloc, orient],
... for the n blocks]
nb_species : int
car_file_name : String
Name of the PQtree file where current CARs are stored
step_nb : int
Current step of the ProCars method
binary_file_name : String
Name of the binary file where information about current cars must be saved
|
Returns: | tuple
cars: array of arrays (cars)
nb_cars: integer, total number of cars
block_position_in_car: integer array
blocks_car: array of arrays (blocks’ cars): see return of compute_blocks_cars
|
-
procars.step_modules.compute_conserved_adjacencies.compute_adjacencymatrix_from_neighbor_arrays(left, right, nb_blocks)[source]
Function computing an adjacency matrix from left and right neighbors
Parameters: | left : list
Array such that left[block_id] is an array containing all left neighbors of block block_id
right : list
Array such that right[block_id] is an array containing all right neighbors of block_id
nb_blocks : int
|
Returns: | SparseMatrix
- adjacencymatrix: 0-1 Matrix which rows and columns correspond to block extremities with:
- right extremity of bloc_id : extremity_id = block_id
- left extremity of bloc_id : extremity_id = block_id + nb_blocks
|
-
procars.step_modules.compute_conserved_adjacencies.compute_blocks_cars(nb_species, blocks, block_to_car)[source]
Computes the list of blocks’ cars for each species
Parameters: | nb_species : int
blocks : list
array of arrays (blocks for each species)
blocks = [[list of blocks for spe_id=0], [list of blocks for spe_id=1], ...]
with list of blocks for spe_id=1 =
[[list of blocks for spe_id=1, chromosome 1],
[list of blocks for spe_id=1, chromosome 2],
...]
with list of blocks for spe_id=1, chromosome 1 =
[[start, end, num_bloc, orient],
[start, end, num_bloc, orient],
... for the n blocks]
block_to_car : list
Integer array such that block_2_car[block_id] = car_id to which block_id belongs
|
Returns: | list
array of arrays (blocks’ cars)
blocks_car = [[list of block_cars for spe_id=0], [list of block_cars for spe_id=1], ...]
with list of block_cars for spe_id=1 =
[[list of blocks’ cars for spe_id=1, chromosome 1],
[list of blocks’ cars for spe_id=1, chromosome 2],
...]
|
-
procars.step_modules.compute_conserved_adjacencies.compute_cars_neighbors(nb_cars, nb_species, blocks, cars, blocks_car, block_position_in_car, binary_file_name)[source]
Finds all possible neighbors for each CAR
Parameters: | nb_cars : int
nb_species : int
blocks : list
All blocks, for each species: array of arrays (blocks for each species)
blocks = [[list of blocks for spe_id=0], [list of blocks for spe_id=1], ...]
with list of blocks for spe_id=1 =
[[list of blocks for spe_id=1, chromosome 1],
[list of blocks for spe_id=1, chromosome 2],
...]
with list of blocks for spe_id=1, chromosome 1 =
[[start, end, num_bloc, orient],
[start, end, num_bloc, orient],
... for the n blocks]
cars : list
List of cars with signed blocks in each one
blocks_cars : list
blocks_car = [[list of block_cars for spe_id=0], [list of block_cars for spe_id=1], ...]
with list of block_cars for spe_id=1 =
[[list of blocks’ cars for spe_id=1, chromosome 1],
[list of blocks’ cars for spe_id=1, chromosome 2],
...]
block_position_in_car : list
Integer array such that block_position_in_car[block_id] = position of block
in car to which it belongs
|
Returns: | list
left car neighbors and right car neighbors
such that left[spe_id][car_id] = [list of left neighbors of car_id in species spe_id]
|
-
procars.step_modules.compute_conserved_adjacencies.compute_conserved_adjacencies(left, right, nb_blocks, species1_id, species2_id, species3_id)[source]
Procedure computing the conservation status of all potential adjacencies
Fills arrays for fully conserved adjacencies (conserved_left/right_D) and partly conserved
adjacencies (conserved_left/right)
Parameters: | left : list
Array such that left[block_id] is an array containing all left neighbors of block block_id
right : list
Array such that right[block_id] is an array containing all right neighbors of block_id
nb_blocks : int
species1_id: list
species2_id: list
species3_id: list
|
Returns: | list
conserved_left_D: array of left fully conserved adjacencies
conserved_right_D: array of right fully conserved adjacencies
conserved_left: array of left conserved adjacencies
conserved_right: array of right conserved adjacencies
|
-
procars.step_modules.compute_conserved_adjacencies.compute_extremity(segment, car_id, cars, block_position_in_car, orientation)[source]
Function computing the orientation of a CAR in a CAR adjacency.
Parameters: | segment : list
car_id : int
cars : list
block_position_in_car : list
Integer array such that block_position_in_car[block_id] = position of block
in car to which it belongs
orientation : int
1 if positive orientation, -1 otherwise
|
Returns: | int
extremity: oriented car, or 0 if segment is not an extremity of car
|
-
procars.step_modules.compute_conserved_adjacencies.compute_potential_cars_neighbors(spe_id, blocks, blocks_car)[source]
Compute potential pairs of neighboring cars
Parameters: | spe_id : int
blocks : list
Positions of blocks for each species (see commons_read_cars)
blocks_car : list
Blocks’ CARs for each species. See return of compute_blocks_cars
|
Returns: | tuple
neighbors: pair of cars having consecutive segments in species
segments_neighbors: composition of corresponding segments (in terms of signed blocks)
|
-
procars.step_modules.compute_conserved_adjacencies.diff_nc_matrix_m(matrix1, matrix2, nb_blocks)[source]
Function splitting an adjacency matrix matrix2 into two matrices : the
non-conflicting and the conflicting adjacencies with an other matrix matrix1
Parameters: | matrix1 : SparseMatrix
0-1 Matrix, first adjacency matrix
matrix2 : SparseMatrix
0-1 Matrix, second adjacency matrix
nb_blocks : int
|
Returns: | tuple
nc_adjacencymatrix: SparseMatrix containing non-conflicting adj.
c_adjacencymatrix: SparseMatrix containing conflicting adj.
|
-
procars.step_modules.compute_conserved_adjacencies.fill_conserved_arrays(conserved_left_D, conserved_right_D, conserved_left, conserved_right, left, right, species1_id, species2_id, species3_id, block1, block2)[source]
Method filling the conservation status arrays for a given adjacency block1, block2
Parameters: | conserved_left_D : list
Array of left fully conserved adjacencies to be filled
conserved_right_D : list
Array of right fully conserved adjacencies to be filled
conserved_left : list
Array of left conserved adjacencies to be filled
conserved_right : list
Array of right conserved adjacencies to be filled
left : list
Array such that left[block_id] is an array containing all left neighbors of block block_id
right : list
Array such that right[block_id] is an array containing all right neighbors of block_id
species1_id: list
species2_id: list
species3_id: list
block1: int
First block id of the given adjacency
block2: int
Second block id of the given adjacency
|
-
procars.step_modules.compute_conserved_adjacencies.find_adjacency_sets(nb_cars, nb_species, blocks, cars, blocks_car, block_position_in_car, car_neigh_bin, species1_id, species2_id, species3_id)[source]
Finds all sets of adjacencies, according to their status : (non)conflicting,
fully/partly conserved, etc.
Parameters: | nb_cars : int
nb_species : int
blocks : list
All blocks, for each species: array of arrays (blocks for each species)
blocks = [[list of blocks for spe_id=0], [list of blocks for spe_id=1], ...]
with list of blocks for spe_id=1 =
[[list of blocks for spe_id=1, chromosome 1],
[list of blocks for spe_id=1, chromosome 2],
...]
with list of blocks for spe_id=1, chromosome 1 =
[[start, end, num_bloc, orient],
[start, end, num_bloc, orient],
... for the n blocks]
cars : list
For each car id, ordered list of signed blocks in the car
blocks_car : list
[[list of block_cars for spe_id = 0], [list of block_cars for spe_id = 1], ...]
with list of block_cars for spe_id = 1 =
[[list of blocks’ cars for spe_id=1, chromosome 1],
[list of blocks’ cars for spe_id=1, chromosome 2],
...]
block_position_in_car : list
Integer array such that block_position_in_car[block_id] = position of block in car
to which it belongs
car_neigh_bin : string
Name of the binary file where car neighbors information is stored
species1_id : list
species2_id : list
species3_id : list
|
Returns: | tuple
car_left: left neighbors of each car, such that left[spe_id][car_id] =
[list of left neighbors of car_id in species spe_id]
car_right: same as car_left, for right neighbors
fs_nc_adj: fully conserved non-conflicting adjacencies
ps_nc2_adj: partly conserved non-conflicting adjacencies
fs_c_adj: fully conserved conflicting adjacencies
ps_nc_c_adj: partly conserved adjacencies in conflict with fully conserved ones
ps_c_adj: cnflicting partly conserved adjacencies
|
-
procars.step_modules.compute_conserved_adjacencies.find_info_and_write_adjs(group1_adjs, group2_adjs, car_left, car_right, cars, step_nb, species1_id, species2_id, species3_id, file_handler)[source]
Find the conservation status of all retained adjacencies, and write them
into the output file
Parameters: | group1_adjs : list
List of car_adjacencies fully conserved
group2_adjs : list
List of car_adjacencies partly conserved
car_left : list
For each car, left neighbor
car_right : list
For each car, right neighbor
cars : list
For each car id, ordered list of signed blocks in the car
step_nb : int
Current step of the ProCars method
species1_id : list
species2_id : list
species3_id : list
file_handler : fileObject
Open file in which writing the given adjacencies and their information
|
-
procars.step_modules.compute_conserved_adjacencies.main(car_file_name, adjacency_file_name_prefix, step_nb, tree_bin, blocks_bin, car_bin, car_neigh_bin)[source]
Main method
From a CAR file, it computes a set of conserved adjacencies between CARs
for the ancestor node, and it splits this set into a set of non-conflicting
adjacencies and a set of conflicting adjacencies.
Parameters: | car_file_name : string
Prefix of the PQtree files
adjacency_file_name_prefix : string
Prefix of the Adjacencies files
step_nb : int
Current step of the ProCARs method
tree_bin : string
Name of the binary file where information about the species tree is stored
blocks_bin : string
Name of the binary file where information about blocks is stored
car_bin : string
Name of the binary file where information about current cars is stored
car_neigh_bin : string
Name of the binary file where information about car neighbors is stored
|
Returns: | tuple
nb_final_adj: total number of adjacencies added
nb_discarded: total number of discarded adjacencies
fs_nc_adj: fully conserved non-conflicting adjacencies
ps_nc2_adj: partly conserved non-conflicting adjacencies
fs_c_adj: conflicting fully conserved adjacencies
ps_nc_c_adj: partly conserved adjacencies in conflict with fully conserved ones
ps_c_adj: conflicting partly conserved adjacencies
|
-
procars.step_modules.compute_conserved_adjacencies.reverse(segment)[source]
Function producing the reverse of a segment
Parameters: | segment : list
|
Returns: | list
segment_rev: reverse of given segment
|
Examples
For example:
reverse([1 -2 3]) = [-3 2 -1]
-
procars.step_modules.compute_conserved_adjacencies.segment_car_common_extremity(segment, car, block_position_in_car, side_segment, side_car)[source]
Function checking if a segment and a car share a same extremity region in terms of synteny.
Parameters: | segment : list
car : list
Ordered list of signed blocks in the given CAR
block_position_in_car : list
Integer array such that block_position_in_car[block_id] = position of block
in car to which it belongs
side_segment : int
1 if we look at right end of the segment, 0 otherwise
side_car : int
1 if we look at right end of the CAR, 0 otherwise
|
Returns: | boolean
True if it shares an extremity, False otherwise
|
Examples
- segment_car_common_extremity([1 -2 4 -3], [1 2 3 4], 1, 1) = True, because they share
a end region composed of blocks 3 and 4.
- segment_car_common_extremity([1 -2 4 -3], [1 2 3 4], 0, 0) = True, because they share
a begin region composed of blocks 1 and 2.
- segment_car_common_extremity([1 -2 4 -3], [-3 4 2 -1], 0, 1) = True, because they do
share a same region that is the beginning of segment and the end of car
-
procars.step_modules.compute_conserved_adjacencies.write_discarded_adjs(filename, group1_adjs, group2_adjs, car_left, car_right, species1_id, species2_id, species3_id, cars, step_nb)[source]
Write adjacencies found for ancestor into the output file
Parameters: | filename : String
Name of the output file in which writting all adjacencies
group1_adjs : list
List of car_adjacencies fully conserved but conflicting
group2_adjs : list
List of car_adjacencies partly conserved and in conflict with fully or partly conserved adjs
car_left : list
For each car, left neighbor
car_right : list
For each car, right neighbor
species1_id : list
List of IDs of species in group1 of partition defined by the ancestor node
species2_id : list
List of IDs of species in group2 of partition defined by the ancestor node
species3_id : list
List of IDs of species in group3 of partition defined by the ancestor node
cars : list
step_nb : int
Current step of the ProCARs method
|
-
procars.step_modules.compute_conserved_adjacencies.write_retained_adjs(last_adjs, new_adjs, group1_adjs, group2_adjs, cars, step_nb, car_left, car_right, species1_id, species2_id, species3_id)[source]
Write adjacencies found for ancestor into the output file
Parameters: | last_adjs : String
Name of the file where previously added adjacencies are stored
new_adjs : String
Name of the file where new adjacencies must be written
group1_adjs : list
List of fully conserved non-conflicting adjacencies
group2_adjs : list
List of partly conserved non-conflicting adjacencies
cars : list
step_nb : int
Step of the ProCARs method
car_left : list
For each car, left neighbor
car_right : list
For each car, right neighbor
species1_id : list
List of IDs of species in group1 of partition defined by the ancestor node
species2_id : list
List of IDs of species in group2 of partition defined by the ancestor node
species3_id : list
List of IDs of species in group3 of partition defined by the ancestor node
|
compute_resolved_conflicts module
Copyright © Bonsai - LIFL (Université Lille 1, CNRS UMR 8022) and Inria-Lille Nord Europe
contact: aida.ouangraoua@inria.fr, amandine.perrin@inria.fr
This software is a computer program whose purpose is to progressively reconstruct ancestral
gene orders.
This software is governed by the CeCILL-B license under French law and
abiding by the rules of distribution of free software. You can use,
modify and/or redistribute the software under the terms of the CeCILL-B
license as circulated by CEA, CNRS and Inria at the following URL
http://www.cecill.info, or in the LICENCE file at the root directory of this program.
The fact that you are presently reading this means that you have had
knowledge of the CeCILL-B license and that you accept its terms.
compute_resolved_conflicts module description:
This module is used to resolve conflicts in a given set of adjacencies, when, at the previous
step, the set of non-conflicting adjacencies was empty.
Method:
From a set of conflicting conserved adjacencies with the set of species supporting each
adjacencies, it computes a maximum set of non-conflicting adjacencies.
We use the Sankoffs algorithm’s for the minimal mutation problem.
Module author: Aïda Ouangraoua, Amandine PERRIN
February 2014
-
procars.step_modules.compute_resolved_conflicts.change_adj_discarded_files(adj_prefix_file, step)[source]
Save discarded adjacencies into a new file for information, and remove them from last file
Parameters: | adj_prefix_file : string
Prefix of adjacency file name
step : int
Current step of ProCars method
|
-
procars.step_modules.compute_resolved_conflicts.compute_cost(tree, path, leaves, internals, labels, nb_labels)[source]
Function computing optimal labels for the ancestor node in a tree
Dynamic programming version of the Fitch Algorithm for the Small Parsimony Problem (Fitch 71)
Parameters: | tree: dict
Tree structure:
{node_id: [node_name, parent_id, left_child_id, right_child_id], ...}
Where id = -1 if no parent/left_child/right_child
And node_name = “N” + node_id for internal nodes, or leaf_name for leaf nodes
path : list
Array, associating value 1 to a node if it is on the path between ancestor and root node,
otherwise value 0
leaves : list
List of leaf IDs in the tree
internals : list
Ordered list of internal node IDs in the tree
labels : dict
Containing labels of leaves nodes: 0 if adjacency is absent, 1 if present
nb_labels : int
|
Returns: | list
tab: costs for each node of the tree
|
-
procars.step_modules.compute_resolved_conflicts.find_conflicts_couples(adj_ids)[source]
Find couples of adjacencies in conflict
Among all possible couples of adjacencies, finds the ones which involve a common end.
Parameters: | adj_ids : list
List of IDs such that adj_ids[id] = (bloc1, bloc2) with bloc1 and bloc2 the two block
numbers of the adjacency with id ‘id’
|
Returns: | tuple
conflicts: a list containing all couples of adjacencies IDs which are in conflict (because
they share one end).
|
-
procars.step_modules.compute_resolved_conflicts.find_independant_sets(nb_adjacencies, conflicts)[source]
Find sets of independent adjacencies
From the list of conflicting adjacencies, it tries to create sets by adding new adjacencies
in each set, which are compatible with all adjacencies already in the set.
The set of independent (= not in conflict) adjacencies is then growing at each iteration,
until we cannot add any more adjacency.
Parameters: | nb_adjacencies : int
Total number of adjacencies
conflicts : int
List of pairs of adjacencies IDs which are in conflict
|
Returns: | list
independent_sets: list of sets of compatible adjacencies. All sets have the same size,
which is the maximum possible size of an independent adjacencies set
|
-
procars.step_modules.compute_resolved_conflicts.find_mincost_set(independent_sets, adjacencies_score)[source]
Among all sets of compatible adjacencies of maximum size, find the one with minimum cost
Parameters: | independent_sets : list
List of sets of independent adjacencies IDs
adjacencies_score : list
List of scores of the adjacencies, such that adjacencies_score[id] = score of adjacency
with ID = id
|
Returns: | list
maximum_set: a list of compatible adjacencies, with minimum global cost
|
-
procars.step_modules.compute_resolved_conflicts.get_adjacencies_costs(adj_file, nb_species, leaves, tree, spe_ids, path, internals, ancestor_id)[source]
Read the file where discarded adjacencies were stored, and calculate the cost of each
of these adjacencies
Parameters: | adj_file : string
File in which discarded adjacencies are written
nb_species : int
leaves : list
List of IDs of tree leaves (= genomes)
tree : dict
Tree structure
{node_id: [node_name, parent_id, left_child_id, right_child_id], ...}
Where id = -1 if no parent/left_child/right_child
And node_name = “N” + node_id for internal nodes, or leaf_name for leaf nodes
spe_ids : dict
Species as keys, and their corresponding ID as value
path : list
Array, associating value 1 to a node if it is on the path between ancestor and root node,
otherwise value 0
internals : list
IDs of internal nodes of the tree
ancestor_id : int
|
Returns: | tuple
adjacencies_score: dict such that adjacencies_score[id] = score
of the adjacency with ID id
adj_ids: dict for corresponding adjacencies and adjacency IDs:
adj_ids[id] = (bloc1, bloc2)
step_car_adjs: dict such that step_car_adjs[id] = (car1, car2, type, step, labels)
where car1, car2 corresponds to the car adjacency corresponding
to the adjacency with ID id, type is the type of adjacency,
and step is the step at which the adjacency was found, and labels is some
information about the presence (1) or absence (0) of the adjacency in each species.
|
-
procars.step_modules.compute_resolved_conflicts.main(adjacency_file_name, step, tree_bin)[source]
Main method.
From a set of conflicting conserved adjacencies with the set of species supporting each
adjacencies, it computes a maximum set of non-conflicting adjacencies.
Parameters: | adjacency_file_name : string
Prefix of the adjacency file
step : int
Current step of the ProCARs method
tree_bin : string
Name of the binary file containing all information about the species tree.
|
Returns: | tuple
maximum_set: list
List of retained adjacency ids (one of the independent_sets)
independent_sets: list
List of compatible adjacency sets. All sets have the same size,
which is the maximum possible size of an independent adjacencies set
|
-
procars.step_modules.compute_resolved_conflicts.write_retained_adjs(adj_filename_prefix, adj_infos, maximum_set, adj_ids, step)[source]
Write all adjacencies in the retained set of non-conflicting adjacencies (maximum size
and minimum cost) into a new adjacency file.
Parameters: | adj_filename_prefix : string
Prefix of adjacency file name
adj_infos : dict
- Dictionary with adjacency IDs as keys, and values are a list with:
- the car adjacency (car1, car2)
- the type of adjacency
- the step at which it was found
- the presence of this adjacency in each species
maximum_set : list
List of retained adjacency ids
adj_ids : dict
Keys are adjacency IDs, and values are the adjacency corresponding to the ID
step : int
Current step of ProCars method
|
compute_dcj_adjacencies module
Copyright © Bonsai - LIFL (Université Lille 1, CNRS UMR 8022) and Inria-Lille Nord Europe
contact: aida.ouangraoua@inria.fr, amandine.perrin@inria.fr
This software is a computer program whose purpose is to progressively reconstruct ancestral
gene orders.
This software is governed by the CeCILL-B license under French law and
abiding by the rules of distribution of free software. You can use,
modify and/or redistribute the software under the terms of the CeCILL-B
license as circulated by CEA, CNRS and Inria at the following URL
http://www.cecill.info, or in the LICENCE file at the root directory of this program.
The fact that you are presently reading this means that you have had
knowledge of the CeCILL-B license and that you accept its terms.
compute_dcj_adjacencies module description:
This module is used to find DCJ-reliable adjacencies.
It is called when, at the previous step, both the sets of compatible adjacencies and
conflicting adjacencies are empty. Hence, we try to add new adjacencies by adding DCJ-reliable
ones, if possible.
Module author: Aïda Ouangraoua, Amandine PERRIN
February 2014
-
procars.step_modules.compute_dcj_adjacencies.compute_adjacencymatrix_from_adjacency_array(adjacencies, nb_blocks)[source]
Function computing an adjacency matrix from an adjacency array
Parameters: | adjacencies : list
Array containing pairs of signed blocks, with abs(block1) < abs(block2)
nb_blocks : int
|
Returns: | SparseMatrix
- adjacencymatrix: 0-1 Matrix which rows and columns correspond to block extremities:
- right extremity of bloc_id : extremity_id = block_id
- left extremity of bloc_id : extremity_id = block_id + nb_blocks
|
-
procars.step_modules.compute_dcj_adjacencies.compute_neighbor(spe, block, side, left, right)[source]
Finds the side (left (0) or right (1)) neighbor of the given block (block)
Parameters: | spe : int
Species id for which we want to find a neighbor
block : int
Block number of the species for which we want to find a neighbor
side : int
0 for left extremity of the given block of the given species, 1 for right extremity
left : list
Array such that left[spe_id][block_id] is an array containing the left
neighbor of block block_id in spe_id
right : list
Array such that right[spe_id][block_id] is an array containing the right
neighbor of block_id in spe_id
|
Returns: | int
neighbor: signed integer, neighboring block of the given block
|
-
procars.step_modules.compute_dcj_adjacencies.find_possible_dcj_adjs(nb_cars, possible_partitions, cars, left, right, car_left, car_right)[source]
From a set of CARs in the current ancestral PQtree, and the list of CAR and block
neighbors in all species, finds which CAR adjacencies are DCJ-reliable. A next step will
check if adjacencies in this step are not in conflict.
Parameters: | nb_cars : int
Total number of cars in the current PQtree
possible_partitions : list
List of partitions, which are ordered lists of lists species ids corresponding to different
combinations of the 3 groups partition defined by the ancestor
cars : list
List of lists of ordered signed block numbers in each car
left : list
Array such that left[spe_id][block_id] is an array containing the left
neighbor of block block_id in spe_id
right : list
Array such that right[spe_id][block_id] is an array containing the right
neighbor of block_id in spe_id
car_left : list
Array such that car_left[species_id][car_id] = list of potential left car
neighbors of car_id in species ‘species_id’
car_right : list
Array such that car_right[species_id][car_id] = list of potential right car
neighbors of car_id in species ‘species_id’
|
Returns: | list
car_adjacencies: list of lists (possible DCJ-reliable CAR adjacencies)
|
-
procars.step_modules.compute_dcj_adjacencies.is_adjacency_in_car(cars, block_adjacency)[source]
Checks if a block adjacency belongs to the CARs
Parameters: | cars : list
List of all CARs of the current PQtree
block_adjacency : tuple
One given block adjacency to search among the CARs
|
Returns: | boolean
True if the adjacency is found in the CARs, False otherwise
|
-
procars.step_modules.compute_dcj_adjacencies.is_car_adjacency(spe, car_adjacency, car_left, car_right)[source]
Checks if a given CAR adjacency is a possible car adjacency in the given species
Parameters: | spe : int
Species ID in which searching the car adjacency
car_adjacency : tuple
car_left : list
Array such that car_left[species_id][car_id] = list of potential left car
neighbors of car_id in species ‘species_id’
car_right : list
Array such that car_right[species_id][car_id] = list of potential right car
neighbors of car_id in species ‘species_id’
|
Returns: | boolean
True if car_adjacency belongs to spe, False otherwise
|
-
procars.step_modules.compute_dcj_adjacencies.is_dcj_reliable(possible_partitions, car_adjacency, cars, left, right, car_left, car_right)[source]
Finds signal from a given car_adjacency in all possible partitions of the species tree.
Parameters: | possible_partitions : list
List of partitions, which are ordered lists of species IDs lists corresponding to different
combinations of the 3 groups partition defined by the ancestor
car_adjacency : tuple
(car1, car2) car adjacency to test
cars : list
List of lists of ordered signed block numbers in each car
left : list
Array such that left[spe_id][block_id] is an array containing the left
neighbor of block block_id in spe_id
right : list
Array such that right[spe_id][block_id] is an array containing the right
neighbor of block_id in spe_id
car_left : list
Array such that car_left[species_id][car_id] = list of potential left car
neighbors of car_id in species ‘species_id’
car_right : list
Array such that car_right[species_id][car_id] = list of potential right car
neighbors of car_id in species ‘species_id’
|
Returns: | boolean
True if the adjacency is dcj-reliable and conserved, False otherwise
|
-
procars.step_modules.compute_dcj_adjacencies.is_dcj_reliable_conserved(partition, car_adjacency, block_adjacency, cars, left, right, car_left, car_right)[source]
Finds signal from a given car_adjacency in a given partition of the tree species.
A car adjacency is DCJ-reliable if the signal of this DCJ adjacency is conserved
on a path of the species tree that goes through the ancestor (from a group of the partition
defined by the ancestor to another group of this same partition)
Parameters: | partition : list
Ordered list of lists of species IDs corresponding to a given
combination of the 3 groups partition defined by the ancestor
car_adjacency : tuple
(car1, car2) car adjacency
block_adjacency : tuple
(bloc1, bloc2) block adjacency corresponding to the car adjacency
cars : list
List of lists of ordered signed block numbers in each car
left : list
Array such that left[spe_id][block_id] is an array containing the left
neighbor of block block_id in spe_id
right : list
Array such that right[spe_id][block_id] is an array containing the right
neighbor of block_id in spe_id
car_left : list
Array such that car_left[species_id][car_id] = list of potential left car
neighbors of car_id in species ‘species_id’
car_right : list
Array such that car_right[species_id][car_id] = list of potential right car
neighbors of car_id in species ‘species_id’
|
Returns: | boolean
True if the DCJ-reliable adjacency is conserved, False otherwise
|
-
procars.step_modules.compute_dcj_adjacencies.main(adjacency_file_name_prefix, step_nb, car_bin, neigh_bin, tree_bin, blocks_bin)[source]
Main method, to find DCJ-reliable adjacencies
Parameters: | adjacency_file_name_prefix : string
Prefix of the adjacncy file name
step_nb : int
Current step of the ProCars method
car_bin : string
Name of the binary file where all information about CARs is saved
neigh_bin : string
Name of the binary file where all information about the neighbors of each CAR is saved
tree_bin : string
Name of the binary file where all information about the species tree is saved
blocks_bin : string
Name of the binary file where all information about the blocks is saved
|
Returns: | tuple
step_nb: current step of the ProCars method
dcj_NC_Adj: list of all non-conflicting DCJ-reliable adjacencies
dcj_NC_Pair: list of matrix indexes of non-conflicting DCJ-adjacencies.
|
-
procars.step_modules.compute_dcj_adjacencies.write_retained_adjs(adj_filename_prefix, cars, dcj_NC_Adj, step, car_left, car_right, species1_id, species2_id, species3_id)[source]
Write all retained DCJ-reliable adjacencies into a txt file
Parameters: | adj_filename_prefix : string
Prefix of the file where retained adjacencies are stored
cars : list
List of ordered lists of signed blocks corresponding to the different cars, such that
cars[car_id] = ordered list of blocks in car_id
dcj_NC_Adj : list
List of retained compatible set of dcj-reliable adjacencies
step : int
Current step in the whole method
car_left : list
Array such that car_left[species_id][car_id] = list of potential left car
neighbors of car_id in species ‘species_id’
car_right : list
Array such that car_right[species_id][car_id] = list of potential right car
neighbors of car_id in species ‘species_id’
species1_id : list
List of IDs of species in group1 of partition defined by the ancestor node
species2_id : list
List of IDs of species in group2 of partition defined by the ancestor node
species3_id : list
List of IDs of species in group3 of partition defined by the ancestor node
|
compute_pqtree module
Copyright © Bonsai - LIFL (Université Lille 1, CNRS UMR 8022) and Inria-Lille Nord Europe
contact: aida.ouangraoua@inria.fr, amandine.perrin@inria.fr
This software is a computer program whose purpose is to progressively reconstruct ancestral
gene orders.
This software is governed by the CeCILL-B license under French law and
abiding by the rules of distribution of free software. You can use,
modify and/or redistribute the software under the terms of the CeCILL-B
license as circulated by CEA, CNRS and Inria at the following URL
http://www.cecill.info, or in the LICENCE file at the root directory of this program.
The fact that you are presently reading this means that you have had
knowledge of the CeCILL-B license and that you accept its terms.
compute_pqtree module description:
This module creates a new CAR file, from a number of blocks and a set of adjacencies
between those blocks.
The CAR file has the following format:
#CAR1
_Q bloc1 bloc2 bloc3 ... blocn Q_
#CAR2
_Q bloc1 -bloc2 ... Q_
...
Module author: Aïda Ouangraoua, Amandine PERRIN
February 2014
-
procars.step_modules.compute_pqtree.find_cars(nb_blocks, left, right, adjacency_prefix, step_nb)[source]
From the list of added adjacencies, finds all CARs to put into the PQtree file
Parameters: | nb_blocks : int
left : dict
Dictionnary such that left[abs(block_num)] = signed block_number of block on the left end of
block_num (if block_num > 0: previous block, else : next block). Left blocks calculated
according to the current step adjacency file. If no left block: 0
right : dict
Dictionnary such that right[abs(block_num)] = signed block_number of block on the right end
of block_num (if block_num > 0: next block, else : previous block). Right blocks
calculated according to the current step adjacency file. If no right block: 0
adjacency_prefix : string
Prefix of all adjacencies files
step_nb : int
Current step of ProCARs method
|
Returns: | list
genome: List of lists, such that:
all_cars[car_num] = [bloc1, bloc2, ...] = all ordered signed blocks of car number car_num
|
-
procars.step_modules.compute_pqtree.init_read_from_block_in_cycle(signed_block, nex_signed_block, next_prev_signed_block, next_prev_discarded_signed_block, end)[source]
Initialization of the resolution of cycle CARs
Checks if the block on the end extremity of the given signed_block was also found at last
step or not.
- if it was found and added at last step, keep it
- if it was found but discarded in last step, keep it, but also store it in
to_be_discarded_unresolved in order to remove it if the cycle is still unresolved
- if it was not found at last step, remove it: store it to to_be_discarded_resolved
Parameters: | signed_block : int
Signed block for which we are looking for the adjacency with its next block.
nex_signed_block : int
- Next block of the given signed_block.
- If we look on the left end and signed_block > 0, nex_signed_block =
previous[signed_block]
- If we look on the left end and signed_block < 0, nex_signed_block =
-next[-signed_block]
- If we look on the right end and signed_block > 0, nex_signed_block =
next[signed_block]
- If we look on the right end and signed_block < 0, nex_signed_block =
-previous[-signed_block]
next_prev_signed_block : int
Same as nex_signed_block but for adjacencies added at last step
next_prev_discarded_signed_block : int
Same as nex_signed_block but for adjacencies discarded at last step
end : int
0 for left end, 1 for right end
|
Returns: | tuple
to_be_discarded_unresolved: List containing adjacencies that we add to the current
CAR because they were discarded at the previous step, but that we will remove if we find
a cycle CAR again (unresolved cycle)
to_be_discarded_resolved: List of adjacencies found at this step but not at last step,
and hence to remove from the adjacency file
current_signed_block: New current signed block (next block of signed_block) if the
adjacency was found in last step, or 0 if it was not.
|
-
procars.step_modules.compute_pqtree.main(nb_blocks, adjacency_prefix, car_prefix, step_nb)[source]
Main method, to create the new PQtree file.
Parameters: | nb_blocks : int
adjacency_prefix : string
Prefix of the adjacency file name
car_prefix : string
Prefix of the PQtree file name
step_nb : int
Current step of the ProCARs method
|
-
procars.step_modules.compute_pqtree.read_from_block(signed_block, left, right, end)[source]
From a signed block, and a list of all blocks neighbors, finds a chain of blocks
forming a CAR from the given block towards the given end (left or right)
Parameters: | signed_block : int
ID of the given block, with its orientation (sign)
left : dict
Dictionnary such that left[abs(block_num)] = signed block_number of block on the left end of
block_num (if block_num > 0: previous block, else : next block). Left blocks calculated
according to the current step adjacency file. If no left block: 0
right : dict
Dictionnary such that right[abs(block_num)] = signed block_number of block on the right end
of block_num (if block_num > 0: next block, else : previous block). Right blocks
calculated according to the current step adjacency file. If no right block: 0
end : int
0 for left extremity of the block, 1 for right extremity
|
Returns: | list
car: ordered list of signed blocks in the required CAR or empty list if cycle CAR
|
-
procars.step_modules.compute_pqtree.read_from_block_in_cycle(signed_block, left, right, left_prev, right_prev, left_prev_discarded, right_prev_discarded, end)[source]
Finds a non-cyclic CAR
From a signed_block, which was found to be in a cycle CAR if we added all adjacencies
found at the current step, it tries to find a non cycle CAR
by requiring a new condition: add neighbor in the CAR if it was also found as a conserved
adjacency at last step (in added adjacencies or discarded ones).
Parameters: | signed_block : int
Number of the given block, with its orientation (sign)
left : dict
Dictionnary such that left[abs(block_num)] = signed block_number of block on the left end of
block_num (if block_num > 0: previous block, else : next block). Left blocks calculated
according to the current step adjacency file. If no left block: 0
right : dict
Dictionnary such that right[abs(block_num)] = signed block_number of block on the right end
of block_num (if block_num > 0: next block, else : previous block). Right blocks
calculated according to the current step adjacency file. If no right block: 0
left_prev : dict
Same as left, but for adjacencies found in last step
right_prev : dict
Same as right, but for adjacencies found in last step
left_prev_discarded : dict
For each block number, an array containing its potential left neighbors:
left[bloc1] = [bloc2, -bloc3,..]. According to adjacencies discarded at the last step
right_prev_discarded : dict
For each block number, an array containing its potential right neighbors:
right[bloc1] = [-bloc2, bloc3,..]. According to adjacencies discarded at the last step
end : int
0 for left extremity of the block, 1 for right extremity
|
Returns: | list
car: ordered list of signed blocks in the required CAR. If the cycle is unresolved, it
returns the cycle CAR. Else, it returns the part of the CAR from the block signed_block
until the end extremity of the CAR.
to_be_discarded: adjacencies to remove from the current set of adjacencies, because of the
resolution of the cycle CAR.
|
-
procars.step_modules.compute_pqtree.remove_adjacency_from_file(adjacency_file_name, adjacency_list)[source]
Removes an adjacency in an adjacency file
When all conserved non-conflicting adjacencies are found, this module computes the sets of
blocks into CARs. If adjacencies added infer a cycle CAR, we eliminate one of these
adjacencies (the less reliable) in order to form a linear chromosome. The removed adjacency
must then be removed from the adjacency file.
Parameters: | adjacency_file_name : string
Name of the current step adjacency file
adjacency_list : list
List of lists: adjacencies to remove from the file.
|