Uncategorized

Clustering

Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). It is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, bioinformatics, data compression, and computer graphics.

Cluster analysis itself is not one specific algorithm, but the general task to be solved. It can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them. Popular notions of clusters include groups with small distances among the cluster members, dense areas of the data space, intervals or particular statistical distributions. Clustering can therefore be formulated as a multi-objective optimization problem. The appropriate clustering algorithm and parameter settings (including values such as the distance function to use, a density threshold or the number of expected clusters) depend on the individual data set and intended use of the results. Cluster analysis as such is not an automatic task, but an iterative process of knowledge discovery or interactive multi-objective optimization that involves trial and failure. It is often necessary to modify data preprocessing and model parameters until the result achieves the desired properties.
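To make one of these notions concrete, here is a minimal sketch of k-means, an algorithm built on the "small distances among the cluster members" idea. It is only one of many possible clustering algorithms, and the function name kmeans_1d, the sample points and the starting centers are assumptions chosen for illustration.

```python
def kmeans_1d(points, centers, iterations=10):
    """Lloyd-style k-means on 1-D data, starting from the given centers."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Made-up sample data with two visible groups, and a guess for the two centers.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centers, groups = kmeans_1d(points, centers=[0.0, 6.0])
print(centers)
print(groups)
```

The number of clusters (here two, through the starting centers) is exactly the kind of parameter that has to be chosen for the data set at hand.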

Besides the term clustering, there are a number of terms with similar meanings, including automatic classification, numerical taxonomy, and typological analysis. The subtle differences are often in the usage of the results: while in data mining, the resulting groups are the matter of interest, in automatic classification the resulting discriminative power is of interest. This often leads to misunderstandings between researchers coming from the fields of data mining and machine learning, since they use the same terms and often the same algorithms, but have different goals.

Standard

Caltech 101 DataSet For Project

Description

Pictures of objects belonging to 101 categories. About 40 to 800 images per category. Most categories have about 50 images. Collected in September 2003 by Fei-Fei Li, Marco Andreetto, and Marc'Aurelio Ranzato. The size of each image is roughly 300 x 200 pixels.
We have carefully clicked outlines of each object in these pictures; these are included under ‘Annotations.tar’. There is also a MATLAB script to view the annotations, ‘show_annotations.m’.

How to use the dataset

If you are using the Caltech 101 dataset for testing your recognition algorithm, you should try to make your results comparable to the results of others. We suggest training and testing on a fixed number of pictures and repeating the experiment with different random selections of pictures in order to obtain error bars. Popular numbers of training images: 1, 3, 5, 10, 15, 20, 30. Popular numbers of testing images: 20, 30. See also the discussion below.
When you report your results please keep track of which images you used and which were misclassified. We will soon publish a more detailed experimental protocol that allows you to report those details. See the Discussion section for more details.
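The suggested protocol (fixed numbers of training and testing pictures, repeated over different random selections) could be sketched as follows for the image list of a single category. The file names, the helper names random_split and repeated_trials, and the dummy_evaluate function are placeholders assumed for illustration; a real experiment would plug in its own recognition algorithm.

```python
import random
import statistics


def random_split(images, n_train, n_test, rng):
    """Pick disjoint training and testing subsets of fixed size."""
    chosen = rng.sample(images, n_train + n_test)
    return chosen[:n_train], chosen[n_train:]


def repeated_trials(images, n_train, n_test, evaluate, n_trials=10, seed=0):
    """Run `evaluate` on several random splits; return mean and standard deviation."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_trials):
        train, test = random_split(images, n_train, n_test, rng)
        scores.append(evaluate(train, test))
    return statistics.mean(scores), statistics.stdev(scores)


# Placeholder file names standing in for one category of the dataset.
images = [f"image_{i:04d}.jpg" for i in range(50)]


def dummy_evaluate(train, test):
    """Stand-in for training on `train` and measuring accuracy on `test`."""
    return len(test) / (len(train) + len(test))


mean_acc, std_acc = repeated_trials(images, n_train=15, n_test=20,
                                    evaluate=dummy_evaluate)
print(mean_acc, std_acc)
```

The standard deviation over the trials is what gives the error bars; recording the train and test lists of each trial also makes it possible to report which images were used.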


Deep Learning

One of machine learning’s core goals is classification of input data. This is the task of taking novel data and assigning it to one of a pre-determined number of labels, based on what the classifier learns from a training set. For instance, a classifier could take an image and predict whether it is a cat or a dog.

The pieces of information fed to a classifier for each data point are called features, and the category they belong to is a ‘target’ or ‘label’. Typically, the classifier is given data points with both features and labels, so that it can learn the correspondence between the two. Later, the classifier is queried with a data point and tries to predict what category it belongs to. A large group of these query data points constitutes a prediction set, and the classifier is usually evaluated on its accuracy, that is, the fraction of prediction queries it gets correct.
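As a small illustration of the evaluation step, accuracy can be computed by comparing predicted labels with the true labels of the prediction set. The function name accuracy and the label lists below are made up for this sketch.

```python
def accuracy(predicted_labels, true_labels):
    """Fraction of prediction queries whose predicted label matches the true label."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("both label lists must have the same length")
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)


# Made-up labels for four query images.
true_labels = ["cat", "dog", "dog", "cat"]
predicted_labels = ["cat", "dog", "cat", "cat"]

print(accuracy(predicted_labels, true_labels))  # 0.75 (3 of 4 correct)
```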

There are many methods to perform classification, such as SVMs, logistic regression, deep learning, and more.

  • Stage 1: Train a DNN classifier on a large, general dataset. A good example is ImageNet, with 1000 categories and 1.2 million images.
  • Stage 2: The outputs of each layer in the DNN can be viewed as a meaningful vector representation of each image. Extract these feature vectors from the layer prior to the output layer on each image of your task.
  • Stage 3: Train a new classifier with those features as input for your own task.

Python Objects and Class

Python is an object-oriented programming language. Unlike procedure-oriented programming, in which the main emphasis is on functions, object-oriented programming stresses objects. An object is simply a collection of data (variables) and methods (functions) that act on those data.

A class is a blueprint for the object. We can think of a class as a sketch (prototype) of a house. It contains all the details about the floors, doors, windows etc. Based on these descriptions we build the house. The house is the object. Just as many houses can be built from one description, we can create many objects from a class. An object is also called an instance of a class, and the process of creating this object is called instantiation.

Defining a Class in Python

Just as function definitions begin with the keyword def, in Python we define a class using the keyword class. The first string is called the docstring and gives a brief description of the class. Although not mandatory, it is recommended. Here is a simple class definition.


class MyNewClass:
    '''This is a docstring. I have created a new class'''
    pass

A class creates a new local namespace where all its attributes are defined. Attributes may be data or functions. There are also special attributes in it that begin with double underscores (__). For example, __doc__ gives us the docstring of that class. As soon as we define a class, a new class object is created with the same name. This class object allows us to access the different attributes as well as to instantiate new objects of that class.


>>> class MyClass:
...     "This is my second class"
...     a = 10
...     def func(self):
...         print('Hello')
...   
>>> MyClass.a
10
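As a short sketch of instantiation and of the special attribute __doc__ described above, the same class can be used as follows. The variable name ob is arbitrary.

```python
class MyClass:
    "This is my second class"
    a = 10

    def func(self):
        print('Hello')


# Calling the class object creates a new instance (instantiation).
ob = MyClass()

print(MyClass.__doc__)  # the docstring is stored in the special attribute __doc__
print(ob.a)             # class attributes are also reachable through the instance
ob.func()               # equivalent to MyClass.func(ob)
```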

Python Sets

A set is an unordered collection of items. Every element is unique (no duplicates) and must be immutable. However, the set itself is mutable (we can add or remove items). Sets can be used to perform mathematical set operations like union, intersection, symmetric difference etc.

Creating a Set in Python

A set is created by placing all the items (elements) inside curly braces {}, separated by commas, or by using the built-in function set(). It can have any number of items and they may be of different types (integer, float, tuple, string etc.). But a set cannot contain a mutable element such as a list, set or dictionary.


>>> # set of integers
>>> my_set = {1, 2, 3}

>>> # set of mixed datatypes
>>> my_set = {1.0, "Hello", (1, 2, 3)}

>>> # sets do not have duplicates
>>> {1,2,3,4,3,2}
{1, 2, 3, 4}

>>> # set cannot have mutable items
>>> my_set = {1, 2, [3, 4]}
...
TypeError: unhashable type: 'list'

>>> # but we can make set from a list
>>> set([1,2,3,2])
{1, 2, 3}

Creating an empty set is a bit tricky. Empty curly braces {} will make an empty dictionary in Python. To make a set without any elements we use the set() function without any argument.


>>> a = {}
>>> type(a)
<class 'dict'>
>>> a = set()
>>> type(a)
<class 'set'>

Changing a Set in Python

Sets are mutable. But since they are unordered, indexing has no meaning. We cannot access or change an element of a set using indexing or slicing; sets do not support it. We can add a single element using the add() method. Multiple elements can be added using the update() method. The update() method can take tuples, lists, strings or other sets as its arguments. In all cases, duplicates are avoided.


>>> my_set = {1,3}
>>> my_set[0]
...
TypeError: 'set' object is not subscriptable
>>> my_set.add(2)
>>> my_set
{1, 2, 3}
>>> my_set.update([2,3,4])
>>> my_set
{1, 2, 3, 4}
>>> my_set.update([4,5], {1,6,8})
>>> my_set
{1, 2, 3, 4, 5, 6, 8}

Removing Elements from a Set

A particular item can be removed from a set using the methods discard() and remove(). The only difference between the two is that discard() leaves the set unchanged if the item does not exist in it, whereas remove() raises an error in that case. The following example illustrates this.


>>> my_set = {1, 3, 4, 5, 6}
>>> my_set.discard(4)
>>> my_set
{1, 3, 5, 6}
>>> my_set.remove(6)
>>> my_set
{1, 3, 5}
>>> my_set.discard(2)
>>> my_set
{1, 3, 5}
>>> my_set.remove(2)
...
KeyError: 2
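The mathematical set operations mentioned at the start of this section are available as operators and as equivalent methods. A brief sketch, with two example sets A and B chosen only for illustration:

```python
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

print(A | B)  # union: elements in A or B
print(A & B)  # intersection: elements in both A and B
print(A - B)  # difference: elements in A but not in B
print(A ^ B)  # symmetric difference: elements in exactly one of A and B

# Each operator has an equivalent method.
print(A.union(B) == A | B)
print(A.intersection(B) == A & B)
print(A.symmetric_difference(B) == A ^ B)
```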

Python Dictionary

A Python dictionary is a collection of items (since Python 3.7 it preserves insertion order; earlier versions gave no ordering guarantee). While other compound datatypes have only a value as an element, a dictionary has key: value pairs. Dictionaries are optimized to retrieve values when the key is known.

Creating a Dictionary

Creating a dictionary is as simple as placing items inside curly braces {}, separated by commas. An item has a key and the corresponding value expressed as a pair, key: value. While values can be of any datatype and can repeat, keys must be of an immutable type (string, number or tuple with immutable elements) and must be unique. We can also create a dictionary using the built-in function dict().


# empty dictionary
my_dict = {}

# dictionary with integer keys
my_dict = {1: 'apple', 2: 'ball'}

# dictionary with mixed keys
my_dict = {'name': 'John', 1: [2, 4, 3]}

# using dict()
my_dict = dict({1:'apple', 2:'ball'})

# from sequence having each item as a pair
my_dict = dict([(1,'apple'), (2,'ball')])

Accessing Elements in a Dictionary

While indexing is used with other container types to access values, a dictionary uses keys. A key can be used either inside square brackets or with the get() method. The difference when using get() is that it returns None instead of raising a KeyError if the key is not found.


>>> my_dict = {'name':'Ranjit', 'age': 26}
>>> my_dict['name']
'Ranjit'

>>> my_dict.get('age')
26

>>> my_dict.get('address')

>>> my_dict['address']
...
KeyError: 'address'
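The two access styles can be compared side by side in a short script. The second argument to get(), which supplies a fallback value, is part of the standard dict API; the fallback string 'unknown' is just an example.

```python
my_dict = {'name': 'Ranjit', 'age': 26}

# get() returns None for a missing key ...
address = my_dict.get('address')
print(address)

# ... or a fallback value if one is given as the second argument.
address_or_default = my_dict.get('address', 'unknown')
print(address_or_default)

# Square brackets raise KeyError for a missing key.
try:
    my_dict['address']
except KeyError as err:
    print('missing key:', err)
```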

Changing or Adding Elements in a Dictionary

Dictionaries are mutable. We can add new items or change the value of existing items using the assignment operator. If the key is already present, its value gets updated; otherwise a new key: value pair is added to the dictionary.


>>> my_dict
{'name': 'Ranjit', 'age': 26}

>>> my_dict['age'] = 27  # update value
>>> my_dict
{'name': 'Ranjit', 'age': 27}

>>> my_dict['address'] = 'Downtown'  # add item
>>> my_dict
{'name': 'Ranjit', 'age': 27, 'address': 'Downtown'}

Deleting or Removing Elements from a Dictionary

We can remove a particular item from a dictionary with the pop() method. It removes the item with the given key and returns its value. The popitem() method removes and returns a (key, value) pair from the dictionary: the most recently inserted one in Python 3.7 and later, an arbitrary one in earlier versions. All the items can be removed at once using the clear() method. We can also use the del keyword to remove individual items or the entire dictionary itself.


>>> squares = {1:1, 2:4, 3:9, 4:16, 5:25}  # create a dictionary

>>> squares.pop(4)  # remove a particular item
16
>>> squares
{1: 1, 2: 4, 3: 9, 5: 25}

>>> squares.popitem()  # remove the last inserted item (Python 3.7+)
(5, 25)
>>> squares
{1: 1, 2: 4, 3: 9}

>>> del squares[2]  # delete a particular item
>>> squares
{1: 1, 3: 9}

>>> squares.clear()  # remove all items
>>> squares
{}

>>> del squares  # delete the dictionary itself
>>> squares
...
NameError: name 'squares' is not defined


Python Strings

A string is a sequence of characters. A character is simply a symbol. For example, the English alphabet has 26 letters. Computers do not deal with characters; they deal with numbers (binary). Even though you may see characters on your screen, internally they are stored and manipulated as a combination of 0’s and 1’s. This conversion of a character to a number is called encoding, and the reverse process is decoding. ASCII and Unicode are two of the popular encodings in use.
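A minimal sketch of encoding and decoding in Python 3, using the built-in functions ord() and chr() and the str.encode() and bytes.decode() methods. The sample text and the choice of UTF-8 are assumptions made for the example.

```python
# Every character has a numeric code point.
print(ord('A'))  # 65
print(chr(65))   # A

# Encoding turns a string into bytes; decoding reverses it.
text = 'héllo'
encoded = text.encode('utf-8')
decoded = encoded.decode('utf-8')

print(encoded)          # the non-ASCII letter takes two bytes in UTF-8
print(len(text), len(encoded))
print(decoded == text)
```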

Creating a String

Strings can be created by enclosing characters inside single quotes or double quotes. Triple quotes can also be used in Python, but they are generally reserved for multiline strings and docstrings.


# all of the following are equivalent
my_string = 'Hello'
my_string = "Hello"
my_string = '''Hello'''
my_string = """Hello"""

# triple quotes string can extend multiple lines
my_string = """Hello, welcome to
           the exciting world
           of string in Python"""

Accessing Characters in a String

We can access individual characters using indexing and a range of characters using slicing. Indexing starts from 0. Trying to access a character outside the index range raises an IndexError. The index must be an integer; using a float or another type results in a TypeError.

Python allows negative indexing for its sequences. The index of -1 refers to the last item, -2 to the second last item and so on. We can access a range of items in a string by using the slicing operator (colon).


>>> my_string = 'programiz'
>>> my_string[0]   # 1st character
'p'
>>> my_string[-1]  # last character
'z'
>>> my_string[15]  # index must be in range
...
IndexError: string index out of range
>>> my_string[1.5] # index must be an integer
...
TypeError: string indices must be integers
>>> my_string[1:5]  # slicing 2nd to 5th character
'rogr'
>>> my_string[5:-2] # slicing 6th to 7th character
'am'

Slicing can be best visualized by considering the index to be between the elements as shown below. So if we want to access a range, we need the index that will slice the portion from the string.

Figure: Element slicing in Python
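The "index between the elements" view can also be written out in a few lines of code. This is only an illustrative sketch; the variable names middle, head and tail are arbitrary.

```python
my_string = 'programiz'

# Think of the slice positions as sitting between the characters:
#
#    p   r   o   g   r   a   m   i   z
#  0   1   2   3   4   5   6   7   8   9

middle = my_string[1:5]   # cut at positions 1 and 5
print(middle)             # rogr

# Cutting at one position splits the string into two parts that
# join back into the original.
head, tail = my_string[:3], my_string[3:]
print(head, tail)         # pro gramiz
print(head + tail == my_string)

# Negative positions count from the right end.
print(my_string[-4:-1])   # ami
```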

Changing or Deleting a String

Strings are immutable. This means that elements of a string cannot be changed once it has been assigned. We can simply reassign different strings to the same name.


>>> my_string = 'programiz'
>>> my_string[5] = 'a'
...
TypeError: 'str' object does not support item assignment
>>> my_string = 'Python'
>>> my_string
'Python'

We cannot delete or remove characters from a string. But deleting the string entirely is possible using the keyword del.


>>> del my_string[1]
...
TypeError: 'str' object doesn't support item deletion
>>> del my_string
>>> my_string
...
NameError: name 'my_string' is not defined

Python String Operations

Many operations can be performed on strings, which makes them one of the most used datatypes in Python.

Concatenation

Joining two or more strings into a single one is called concatenation. The + operator does this in Python. Simply writing two string literals together also concatenates them. The * operator can be used to repeat a string a given number of times. Finally, if we want to concatenate strings on different lines, we can use parentheses.


>>> # using +
>>> 'Hello ' + 'World!'
'Hello World!'

>>> # two string literals together
>>> 'Hello ''World!'
'Hello World!'

>>> # using *
>>> 'Hello ' * 3
'Hello Hello Hello '

>>> # using parentheses
>>> s = ('Hello '
...      'World')
>>> s
'Hello World'

Iterating Through String

Using a for loop, we can iterate through a string. Here is an example that counts the number of ‘l’ characters in a string.


>>> count = 0
>>> for letter in 'Hello World':
...     if letter == 'l':
...         count += 1
...        
>>> print(count,'letters found')
3 letters found

String Membership Test

We can test whether a substring exists within a string using the keyword in.


>>> 'a' in 'program'
True
>>> 'at' not in 'battle'
False

Tuples

In Python programming, a tuple is similar to a list. The difference between the two is that we cannot change the elements of a tuple once it is assigned, whereas in a list, elements can be changed.

Creating a Tuple

A tuple is created by placing all the items (elements) inside parentheses (), separated by commas. The parentheses are optional, but it is good practice to write them. A tuple can have any number of items and they may be of different types (integer, float, list, string etc.).

# empty tuple
my_tuple = ()

# tuple having integers
my_tuple = (1, 2, 3)

# tuple with mixed datatypes
my_tuple = (1, "Hello", 3.4)

# nested tuple
my_tuple = ("mouse", [8, 4, 6], (1, 2, 3))

# tuple can be created without parentheses
# also called tuple packing
my_tuple = 3, 4.6, "dog"
# tuple unpacking is also possible
a, b, c = my_tuple

Creating a tuple with one element is a bit tricky. Having one element within parentheses is not enough. We need a trailing comma to indicate that it is in fact a tuple.

>>> my_tuple = ("hello")   # parentheses alone are not enough
>>> type(my_tuple)
<class 'str'>
>>> my_tuple = ("hello",)  # need a comma at the end
>>> type(my_tuple)
<class 'tuple'>
>>> my_tuple = "hello",    # parentheses are optional
>>> type(my_tuple)
<class 'tuple'>

Accessing Elements in a Tuple

There are various ways in which we can access the elements of a tuple.

Indexing

We can use the index operator [] to access an item in a tuple. Indexing starts from 0, so a tuple having 6 elements has indices from 0 to 5. Trying to access an element outside this range raises an IndexError. The index must be an integer; using a float or another type results in a TypeError. Nested tuples are accessed using nested indexing.

>>> my_tuple = ('p','e','r','m','i','t')
>>> my_tuple[0]
'p'
>>> my_tuple[5]
't'
>>> my_tuple[6]   # index must be in range
...
IndexError: tuple index out of range
>>> my_tuple[2.0] # index must be an integer
...
TypeError: tuple indices must be integers or slices, not float
>>> n_tuple = ("mouse", [8, 4, 6], (1, 2, 3))
>>> n_tuple[0][3]  # nested index
's'
>>> n_tuple[1][1]  # nested index
4
>>> n_tuple[2][0]  # nested index
1

Negative Indexing

Python allows negative indexing for its sequences. The index of -1 refers to the last item, -2 to the second last item and so on.

>>> my_tuple = ('p','e','r','m','i','t')
>>> my_tuple[-1]
't'
>>> my_tuple[-6]
'p'

Slicing

We can access a range of items in a tuple by using the slicing operator (colon).

>>> my_tuple = ('p','r','o','g','r','a','m','i','z')
>>> my_tuple[1:4]  # elements 2nd to 4th
('r', 'o', 'g')
>>> my_tuple[:-7]  # elements beginning to 2nd
('p', 'r')
>>> my_tuple[7:]   # elements 8th to end
('i', 'z')
>>> my_tuple[:]    # elements beginning to end
('p', 'r', 'o', 'g', 'r', 'a', 'm', 'i', 'z')

Slicing can be best visualized by considering the index to be between the elements as shown below. So if we want to access a range, we need the index that will slice the portion from the tuple.

Figure: Element slicing in Python

Changing or Deleting a Tuple

Unlike lists, tuples are immutable. This means that the elements of a tuple cannot be changed once it has been assigned. But if an element is itself a mutable datatype like a list, its nested items can be changed. We can also assign a different tuple to the same name (reassignment).

>>> my_tuple = (4, 2, 3, [6, 5])
>>> my_tuple[1] = 9  # we cannot change an element
...
TypeError: 'tuple' object does not support item assignment
>>> my_tuple[3] = 9  # we cannot change an element
...
TypeError: 'tuple' object does not support item assignment
>>> my_tuple[3][0] = 9   # but item of mutable element can be changed
>>> my_tuple
(4, 2, 3, [9, 5])
>>> my_tuple = ('p','r','o','g','r','a','m','i','z') # tuples can be reassigned
>>> my_tuple
('p', 'r', 'o', 'g', 'r', 'a', 'm', 'i', 'z')

We can use the + operator to combine two tuples. This is also called concatenation. The * operator repeats a tuple a given number of times. Both operations result in a new tuple.

>>> (1, 2, 3) + (4, 5, 6)
(1, 2, 3, 4, 5, 6)
>>> ("Repeat",) * 3
('Repeat', 'Repeat', 'Repeat')

We cannot delete or remove items from a tuple. But deleting the tuple entirely is possible using the keyword del.

>>> my_tuple = ('p','r','o','g','r','a','m','i','z')
>>> del my_tuple[3] # can't delete items
...
TypeError: 'tuple' object doesn't support item deletion
>>> del my_tuple    # can delete entire tuple
>>> my_tuple
...
NameError: name 'my_tuple' is not defined

Lists

Changing or Adding Elements to a List

Lists are mutable, meaning their elements can be changed, unlike those of a string or tuple. We can use the assignment operator (=) to change an item or a range of items.


>>> odd = [2, 4, 6, 8]    # mistaken values
>>> odd[0] = 1            # change the 1st item
>>> odd
[1, 4, 6, 8]
>>> odd[1:4] = [3, 5, 7]  # change 2nd to 4th items
>>> odd                   # changed values
[1, 3, 5, 7]

We can add one item to a list using the append() method or add several items using the extend() method.


>>> odd = [1, 3, 5]
>>> odd.append(7)
>>> odd
[1, 3, 5, 7]
>>> odd.extend([9, 11, 13])
>>> odd
[1, 3, 5, 7, 9, 11, 13]

We can also use the + operator to combine two lists. This is also called concatenation. The * operator repeats a list a given number of times.


>>> odd = [1, 3, 5]
>>> odd + [9, 7, 5]
[1, 3, 5, 9, 7, 5]
>>> ["re"] * 3
['re', 're', 're']

Furthermore, we can insert one item at a desired location by using the method insert(), or insert multiple items by squeezing them into an empty slice of a list.


>>> odd = [1, 9]
>>> odd.insert(1,3)
>>> odd
[1, 3, 9]
>>> odd[2:2] = [5, 7]
>>> odd
[1, 3, 5, 7, 9]

Deleting or Removing Elements from a List

We can delete one or more items from a list using the keyword del. It can even delete the list entirely.


>>> my_list = ['p','r','o','b','l','e','m']
>>> del my_list[2]    # delete one item
>>> my_list
['p', 'r', 'b', 'l', 'e', 'm']
>>> del my_list[1:5]  # delete multiple items
>>> my_list
['p', 'm']
>>> del my_list       # delete entire list
>>> my_list
...
NameError: name 'my_list' is not defined

We can use the remove() method to remove a given item, or the pop() method to remove the item at a given index. The pop() method removes and returns the last item if no index is provided. This helps us implement lists as stacks (a first-in, last-out data structure). We can also use the clear() method to empty a list.


>>> my_list = ['p','r','o','b','l','e','m']
>>> my_list.remove('p')
>>> my_list
['r', 'o', 'b', 'l', 'e', 'm']
>>> my_list.pop(1)
'o'
>>> my_list
['r', 'b', 'l', 'e', 'm']
>>> my_list.pop()
'm'
>>> my_list
['r', 'b', 'l', 'e']
>>> my_list.clear()
>>> my_list
[]
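As a brief sketch of the stack idea mentioned above, append() pushes an item onto the end of the list and pop() without an index takes the most recently added item off again. The item names are arbitrary.

```python
stack = []

# Push three items; the last one pushed ends up on top.
stack.append('first')
stack.append('second')
stack.append('third')

top = stack.pop()   # removes and returns the last item pushed
print(top)          # third
print(stack)        # ['first', 'second']

next_top = stack.pop()
print(next_top)     # second
```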

Finally, we can also delete items in a list by assigning an empty list to a slice of elements.


>>> my_list = ['p','r','o','b','l','e','m']
>>> my_list[2:3] = []
>>> my_list
['p', 'r', 'b', 'l', 'e', 'm']
>>> my_list[2:5] = []
>>> my_list
['p', 'r', 'm']