1 Overview

1.1 Canadian Hansards
The main corpus for this assignment comes from the official records (Hansards) of the 36th Canadian Parliament, including debates from both the House of Commons and the Senate. This corpus is available at /u/cs401/A2/data/Hansard/ and has been split into Training/ and Testing/ directories. The data set consists of pairs of corresponding files (*.e is the English equivalent of the French *.f) in which every line is a sentence. Sentence alignment has already been performed for you: the n-th sentence in one file corresponds to the n-th sentence in its corresponding file (e.g., line n in fubar.e is aligned with line n in fubar.f). Note that this data consists only of one-to-one sentence pairs; many-to-one, many-to-many, and one-to-many alignments are not included.
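As a minimal sketch of how this alignment can be exploited, the snippet below pairs line n of an English file with line n of its French counterpart. The file names and sample sentences here are invented for illustration; real files live under /u/cs401/A2/data/Hansard/ and follow the same one-line-per-sentence format.

```python
# Illustrative sketch: pairing aligned sentences from a *.e/*.f file pair.
def read_bitext(e_lines, f_lines):
    """Pair line n of the English file with line n of the French file."""
    assert len(e_lines) == len(f_lines), "aligned files must have the same length"
    return [(e.strip(), f.strip()) for e, f in zip(e_lines, f_lines)]

# Toy stand-ins for the lines of fubar.e and fubar.f:
pairs = read_bitext(
    ["hello world\n", "thank you\n"],
    ["bonjour le monde\n", "merci\n"],
)
```

In practice you would obtain `e_lines` and `f_lines` by reading the corresponding `.e` and `.f` files; the zip-by-line pattern works because the corpus guarantees one sentence per line with matching line numbers.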
We will be implementing a simple seq2seq model, with and without attention, based largely on the course
material. You will train the models with teacher-forcing and decode using beam search. We will write
it in PyTorch version 1.2.0 (https://pytorch.org/docs/1.2.0/), which is the version installed on the
teach.cs servers. For those unfamiliar with PyTorch, we suggest you first read the PyTorch tutorial (https:
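To preview the training regime mentioned above, the toy sketch below illustrates teacher forcing only: at each step the decoder is conditioned on the gold previous token rather than its own prediction. The `step` function is an invented deterministic stand-in for a learned decoder cell, not part of the assignment's model.

```python
# Illustrative-only sketch of teacher forcing with a toy "decoder" step.
def step(prev_token, state):
    # Invented deterministic rule standing in for a learned decoder step.
    new_state = state + prev_token
    predicted = (new_state * 7) % 5  # pretend "logits -> argmax"
    return predicted, new_state

def decode_teacher_forced(target, sos=0):
    """At step t, condition on the gold token target[t-1], not the model's output."""
    state, prev, out = 0, sos, []
    for gold in target:
        pred, state = step(prev, state)
        out.append(pred)
        prev = gold  # teacher forcing: feed the reference token forward
    return out
```

During training, the loss compares each `pred` against the corresponding gold token; at test time, `prev` would instead be the model's own prediction (or a beam of candidates under beam search).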
1.3 Tensors and batches
PyTorch, like many deep learning frameworks, operates on tensors, which are multi-dimensional arrays.
When you work in PyTorch, you will rarely, if ever, work with just one bitext pair at a time. You will instead work with multiple sequences packed into one tensor, organized along a batch dimension. This means that a pair of source and target tensors F and E actually corresponds to multiple sequences F = (F^(1), F^(2), ..., F^(N)) and E = (E^(1), E^(2), ..., E^(N)). We work with batches instead of individual sequences because:
a) backpropagating the average gradient over a batch tends to converge faster than single samples, and b)
the computations for individual samples can be performed in parallel. For example, if we want to multiply source sequences F^(n) and F^(n+1) with an embedding matrix W, we can tell one CPU core to compute the result for F^(n) and another for F^(n+1), halving the overall time it would take to compute them one after the other. Learning
to work with tensors can be difficult at first, but is integral to efficient computation. We suggest you read
more about it in the NumPy docs (https://docs.scipy.org/doc/numpy/user/theory.broadcasting.html#array-broadcasting-in-numpy), whose broadcasting rules PyTorch borrows for its tensors.
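Since PyTorch follows NumPy's array conventions, the batching idea above can be sketched in NumPy: stack variable-length token-id sequences into one (N, T) batch, then embed every token with a single vectorized lookup instead of a per-sequence loop. The vocabulary size, embedding dimension, pad index, and token ids below are all invented for illustration.

```python
import numpy as np

def batch_sequences(seqs, pad_id=0):
    # Stack variable-length token-id sequences into one (N, T_max) array,
    # padding short sequences with pad_id (0 is an assumed pad index).
    t_max = max(len(s) for s in seqs)
    batch = np.full((len(seqs), t_max), pad_id, dtype=np.int64)
    for n, s in enumerate(seqs):
        batch[n, : len(s)] = s
    return batch

F = batch_sequences([[4, 7, 9], [5, 2]])   # shape (2, 3); the second row is padded

rng = np.random.default_rng(0)
W = rng.standard_normal((10, 4))           # toy embedding matrix: vocab 10, dim 4

embedded = W[F]                            # one vectorized lookup over the whole batch: (2, 3, 4)
per_seq = np.stack([W[f] for f in F])      # same result, computed one sequence at a time
```

The vectorized form `W[F]` computes exactly what the per-sequence loop does, but lets the library dispatch the work in parallel, which is the point of batching.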