Date: 2024-03-26 08:34

Task 1 - Secret Messages

Can you guess what the encoded message below says?

Tha rein in Spein fells meinly in tha mounteins, not pleins

If you got it, nice work! If not, don’t worry – you’ll have a program to do this very

soon! The answer is, "The rain in Spain falls mainly in the mountains, not plains", and

we can get this by replacing all of the e's in the encoded message with a's, and vice versa (i.e. A ↔ E).

Let’s try another one. Can you guess what the message below says?

I lofe ctudying artivisial intelligense

This one is harder, because there are two pairs of swapped letters: V ↔ F and S ↔ C.

If we reverse these swaps, then the answer is, "I love studying artificial intelligence",

(which we really hope is true!). One more puzzle: if you start with the message,

"Cabs are taxis.", and you apply the swaps A ↔ B, and then B ↔ C, what encoded

message would you get?

The answer is, "Bcas cre tcxis"

• Start: Cabs are taxis

• Swap A ↔ B: Cbas bre tbxis

• Swap B ↔ C: Bcas cre tcxis

Notice that the encoded messages so far still resemble the original message,

because we haven’t swapped many letters. However, if we continue to add swaps,

the messages will become harder to read, so it would be nice to have a program to

help us out.

For this task, you will write a function to encode and decode messages using the

above letter-swapping method (which is how the secret message in the

introduction was encoded). The function should have three parameters:

1. A string specifying the key (i.e. the sequence of letter swaps). For example,

"AEGHAG", would mean we should apply the swaps A ↔ E, G ↔ H, then A ↔

G if we’re encoding, or the reverse (A ↔ G, G ↔ H, then A ↔ E) if we’re

decoding. Note that "AEGHAG" is the same as "EAGHAG", since A ↔ E is the

same as E ↔ A.

2. The name of a text file containing the message to be encoded or decoded.

3. Either 'e' or 'd' indicating whether to encode or decode, respectively.

The function will return the resulting encoded or decoded message as a string, with

capitalisation, punctuation and spacing preserved. Here are some example calls to

the function:

>>> print(task1('AE','spain.txt','d'))

The rain in Spain falls mainly in the mountains, not plains.

>>> print(task1('VFSC','ai.txt','d'))

I love studying artificial intelligence.

>>> print(task1('ABBC','cabs_plain.txt','e'))

Bcas cre tcxis.
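
The swap procedure described above can be sketched in Python as follows. This is only one possible approach, not a reference solution: the helper names swap_letters and apply_key are hypothetical, and the sketch assumes the key always has an even number of letters and that the file can be read as plain text.

```python
def swap_letters(message, a, b):
    """Swap every occurrence of letters a and b, keeping each letter's case."""
    table = str.maketrans(
        a.lower() + b.lower() + a.upper() + b.upper(),
        b.lower() + a.lower() + b.upper() + a.upper(),
    )
    return message.translate(table)


def apply_key(message, key, mode):
    """Apply the swaps in key in order ('e') or in reverse order ('d')."""
    # Assumption: the key has an even length, two letters per swap.
    pairs = [(key[i], key[i + 1]) for i in range(0, len(key) - 1, 2)]
    if mode == 'd':
        pairs.reverse()
    for a, b in pairs:
        message = swap_letters(message, a, b)
    return message


def task1(key, filename, mode):
    """Read the message from a file and encode or decode it."""
    with open(filename) as f:
        return apply_key(f.read(), key, mode)
```

Because each swap is its own inverse, decoding only needs the same swaps applied in the opposite order, which is why apply_key reverses the list of pairs rather than changing the swaps themselves.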

Task 2 - Search Space

Congratulations! We can now encrypt and decrypt messages if we have the key (i.e.

the sequence of letters to swap). However, what happens if we don’t have the key?

Well, as the name of this assignment suggests, we’ll have to search for one! In this

task, we’ll look at how we can represent our search space as a tree and we’ll also

work on a program to generate child nodes for that tree. This will be very helpful

when we come to implement our search algorithms later.

Before starting, let’s revise the key elements of a search problem from the lecture

slides:

Figure 1: The four elements of search problem formulation (COMP3308/3608 W2 slides)

In our case, the initial state is the encrypted message. Can you work out what each

of the other elements (i.e. goal state, operators and path cost function) should be?

The answers are—wait! Are you sure you want to read on? Thinking about these

questions is a great exercise (and helpful for the exam). If yes, the answers are as

follows: 1) the goal state is the decoded message, 2) the operators are the letter

swaps (e.g. A ↔ E), since these transform messages into other messages and 3) the

path cost is the number of letter swaps (e.g. if we applied A ↔ E, then E ↔ B, that

would have a cost of 2).

Now that we have formulated our search problem, we can start setting up tools to

help us with the search. In this task, you will write a function to find all of the

successors of a state in our search space, given a set of allowed letters to swap. The

function should have two parameters:

1. The name of a text file containing the parent state

2. A string containing all letters that are allowed to be swapped. For example,

“ABC” would mean A ↔ B, A ↔ C and B ↔ C are allowed, but nothing else.

Note that we are adding this condition so we can make the state space

smaller, which will help with debugging. This will also be useful when we

come to decoding the secret message.

The function will return a string which includes the number of successor states,

followed by a list of these states separated by lines. The successors should be

generated by applying the allowed operators in alphabetical order. For example, all of

the A swaps (e.g. A ↔ B, A ↔ C, A ↔ D… etc.) should come before the B swaps (e.g.

B ↔ C, B ↔ D, B ↔ E etc.). Additionally, A ↔ B should come before A ↔ C, since B

comes before C. There is no need to include repeats (e.g. we don’t need B ↔ A, since

it is the same as A ↔ B), or operators that do nothing (e.g. A ↔ A always does

nothing, and A ↔ B does nothing if the message doesn’t contain any A’s or B’s).

Some examples are given below.

>>> print(task2('spain.txt','ABE'))

3

Thb rein in Spein fells meinly in thb mounteins, not pleins.

The rain in Spain falls mainly in the mountains, not plains.

Tha rbin in Spbin fblls mbinly in tha mountbins, not plbins.

>>> print(task2('ai.txt','XZ'))

0

>>> print(task2('cabs.txt','ABZD'))

5

Acbs cre tcxis.

Bcds cre tcxis.

Bczs cre tcxis.

Dcas cre tcxis.

Zcas cre tcxis.

Note: you can adapt your code from Task 1 to help you here.
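
The ordering and filtering rules above can be sketched as follows. This is an illustrative sketch only: the names swap_letters, successors and format_successors are hypothetical, the functions work on a message string rather than a file name, and joining the output with single newlines is an assumption about the exact format.

```python
from itertools import combinations


def swap_letters(message, a, b):
    """Swap every occurrence of letters a and b, keeping each letter's case."""
    table = str.maketrans(
        a.lower() + b.lower() + a.upper() + b.upper(),
        b.lower() + a.lower() + b.upper() + a.upper(),
    )
    return message.translate(table)


def successors(message, letters):
    """Return the child states, generated in alphabetical order of the swaps."""
    present = set(message.upper())
    children = []
    # Sorting first means combinations() yields A-B, A-C, ..., B-C, ... with
    # no repeats (no B-A) and no self-swaps (no A-A).
    for a, b in combinations(sorted(set(letters.upper())), 2):
        # A swap does nothing if neither letter occurs in the message.
        if a in present or b in present:
            children.append(swap_letters(message, a, b))
    return children


def format_successors(children):
    """Count on the first line, then one state per line (assumed layout)."""
    return "\n".join([str(len(children))] + children)
```

Note that the allowed letters are sorted before pairing, so an input such as 'ABZD' still produces the D swaps before the Z swaps.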

Task 3 - Goal

Excellent work! Now that we have our successor state program, we’re almost ready

to search! We just need one more ingredient – a goal test! In this task, you will write

a function to check if a given message is valid English, by comparing it to a common

English word list. The function should take three inputs:

1. The name of a text file containing the message

2. The name of a text file containing a list of words, in alphabetical order and

each on a separate line, which will act as a dictionary of correct words

3. A threshold, t, specifying what percentage of words must be correct for this to

count as a goal (given as an integer between 0 and 100). The threshold is

important, because we may need a buffer if our dictionary is missing words,

or there are some misspelt words in the message.

The function should return a string containing two lines of text. The first line should

be "True" if at least t% of the words in the message are correct according to the

dictionary and "False" otherwise. The second line should be the percentage of words

that were correct, to 2 decimal places (round off any further decimal places; 0.005

rounds up to 0.01). Some examples are given below.

>>> print(task3('jingle_bells.txt','dict_xmas.txt',90))

True

90.00

>>> print(task3('fruit_ode.txt','dict_fruit.txt',80))

False

50.00

>>> print(task3('amazing_poetry.txt','common_words.txt',95))

True

95.65

Dictionary matching is case insensitive; if the dictionary contained only the word

'apple', then 'Apple', 'apple', and 'aPPle' in the message should all count as correct

words according to the dictionary. Words are separated by whitespace (space and

newline characters).
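
A minimal sketch of this goal test is given below, under several assumptions that the task text does not confirm: the function name goal_test is hypothetical, it takes the message and the word list directly instead of file names, punctuation around a word is ignored during lookup, and an empty message scores 0%. The decimal module is used because Python's built-in round() rounds halves to the nearest even digit, which would not satisfy the "0.005 rounds up to 0.01" requirement.

```python
import string
from decimal import Decimal, ROUND_HALF_UP


def goal_test(message, dictionary_words, threshold):
    """Return 'True' or 'False', then the percentage of known words."""
    known = {w.strip().lower() for w in dictionary_words if w.strip()}
    # Words are separated by whitespace, as the task states.
    # Assumption: punctuation around a word is ignored for the lookup.
    words = [w.strip(string.punctuation).lower() for w in message.split()]
    correct = sum(1 for w in words if w in known)
    total = len(words)
    if total == 0:
        # Assumption: an empty message scores 0%.
        percentage = Decimal(0)
    else:
        percentage = Decimal(correct * 100) / Decimal(total)
    # Integer comparison avoids floating-point error at the threshold.
    is_goal = total > 0 and correct * 100 >= threshold * total
    rounded = percentage.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return f"{is_goal}\n{rounded}"
```

For example, a 32-word message with one correct word scores exactly 3.125%, which this sketch reports as 3.13.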

Task 4 - DFS, BFS, IDS, UCS

Fantastic! We now have tools to help us generate children and to perform goal

checks. In this task, you will now combine all your work so far to write a function to

perform uninformed searches. It should take six inputs:

1. A character (d, b, i or u) specifying the algorithm (DFS, BFS, IDS and UCS,

respectively)

2. The name of a text file containing a secret message

3. The name of a text file containing a list of words, in alphabetical order and

each on a separate line, which will act as a dictionary of correct words

4. A threshold, t, specifying what percentage of words must be correct for this to

count as a goal (given as an integer between 0 and 100).

5. A string containing the letters that are allowed to be swapped

6. A character (y or n) indicating whether to print the messages corresponding

to the first 10 expanded nodes.

It should then perform DFS, BFS, IDS or UCS to search for a decryption to the given

message, reusing your code from previous tasks if you would like to. Note that

children should be generated in the same order as in Task 2, and you do not need to

handle cycles. In the case of UCS, if two nodes have the same priority for expansion,

you should expand the one that was added to the fringe earlier. Additionally, you

should stop the search if 1000 nodes have been expanded without finding a solution.
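
The UCS tie-breaking rule above can be sketched with Python's heapq module and an insertion counter. The class name Fringe and its methods are hypothetical, and this is only one way to get first-in, first-out behaviour among nodes of equal cost.

```python
import heapq
from itertools import count


class Fringe:
    """Priority queue: lowest cost first, earliest-added first on ties."""

    def __init__(self):
        self._heap = []
        self._order = count()  # increases with every push

    def push(self, cost, state):
        # The counter sits between cost and state, so equal costs are
        # ordered by insertion time and states are never compared.
        heapq.heappush(self._heap, (cost, next(self._order), state))

    def pop(self):
        cost, _, state = heapq.heappop(self._heap)
        return cost, state

    def __len__(self):
        return len(self._heap)
```

Tracking len(fringe) after each push is one way to record the maximum fringe size that the function has to report.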

The function should return a string. This string must contain the following

information, in order:

1. The decrypted message, key for generating that message and the path cost, if

a solution was found. If no solution was found, the program should print, "No

solution found."

2. The number of nodes expanded during the search. Note that the start node

counts as an expanded node and, in the case of IDS, the final expanded node

count should be the sum of the expanded node counts on each iteration.

3. The maximum number of nodes in the fringe at the same time during the

search

4. The maximum search depth reached. That is, the depth of the deepest

expanded node. Note that the start node has a depth of 0, and its children

have depths of 1.

5. (If indicated with y) the messages corresponding to the first 10 expanded

nodes in the search. If fewer than 10 nodes were expanded, it should print all

expanded nodes.

Some examples of function calls and results are given below.

>>> print(task4('d','cabs.txt','common_words.txt',100,'ABC','y'))

No solution found.

Num nodes expanded: 1000

Max fringe size: 2001

Max depth: 999

First few expanded states:

Bcas cre tcxis.

Acbs cre tcxis.

Bcas cre tcxis.

Task 5 - Heuristics

How exciting! We’ve programmed our very own search algorithms! As a reward,

here’s a secret: the message in the introduction was generated by only swapping the

letters, "A", "E", "N", "O", "S" and "T"!

But there’s a problem: if we try running our task 4 program using just these letters,

we’ll find that none of our four search algorithms actually reaches a solution. We’re

going to need something more efficient, so let’s try some informed search

strategies. We need a heuristic. In this task, we will start by developing a heuristic

based on the frequency of English letters. This is the idea: imagine you counted the

frequencies of the letters in the secret message and found that X was most

common. Then, you counted the frequencies of letters in normal English texts, and

found that E was most common. Could you guess what X in the secret message

stood for? (Yes! E!) We will use this idea when developing our heuristic.

(By the way, the process of comparing letter frequencies to decrypt messages is

called frequency analysis, and it can be applied even when the message has no

spaces, punctuation or capitalisation).

According to this table, if we sort the English letters from most frequent to least

frequent, we get E T A O I N S H R D L… If we limit that to just the letters A E N O S

and T (which are the only ones swapped in the secret message), then the ordering

becomes E T A O N S. Your task is to write a function that compares this theoretical

ordering to the letter ordering in a given message, then estimates how many letter

swaps would be needed to make them the same. The function should take

two inputs:

1. The name of a text file containing the message

2. A boolean (either True or False) indicating whether this message corresponds

to a goal node. (We need this because, to be valid, a heuristic must always

estimate the cost at a goal node to be 0)

The program should output 0 if this is a goal node. Otherwise, it should count how

many times the letters A, E, N, O, S, and T occur in the message and sort them from

most common to least common. For example, if T was the most common letter in

the message, followed by E, then O, then A, then S, then N, then the sorted string

would be TEOASN. Note that, if two letters have the same frequency, you should use

alphabetical order to break ties (e.g. A comes before E).

The program should then compare this sorted string to the theoretical goal

(ETAONS) and count how many letters are in the wrong place. For example, all 6

letters are in the wrong place in TEOASN, but only three are wrong for TAEONS.

Finally, the output heuristic value should be ceiling(n/2), where n is the number of

letters out of place, and the ceiling function rounds up to the nearest integer. Thus

we roughly estimate how many swaps we need to make the ordering the same.

Some example function calls and results are given below.

>>> print(task5('freq_eg1.txt',False))

3

>>> print(task5('freq_eg1.txt',True))

0

>>> print(task5('freq_eg2.txt',False))

2
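
The heuristic described above can be sketched as follows. The function name letter_order_heuristic is hypothetical, it takes the message text rather than a file name, and counting upper- and lower-case letters together is an assumption.

```python
import math


def letter_order_heuristic(message, is_goal, goal_order="ETAONS"):
    """Estimate the swaps needed to match the expected frequency ordering."""
    if is_goal:
        return 0  # the heuristic must be 0 at a goal node
    upper = message.upper()  # assumption: counting ignores case
    counts = {ch: upper.count(ch) for ch in goal_order}
    # Most common first; alphabetical order breaks ties.
    ordered = sorted(goal_order, key=lambda ch: (-counts[ch], ch))
    misplaced = sum(1 for got, want in zip(ordered, goal_order) if got != want)
    # One swap can fix at most two misplaced letters.
    return math.ceil(misplaced / 2)
```

With the two orderings mentioned in the text, TEOASN has 6 misplaced letters and gives 3, while TAEONS has 3 misplaced letters and gives ceiling(3/2) = 2.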

Task 6 - Greedy, A*

In this final task, you should modify your solution to Task 4 to include the greedy and

A* algorithms. The input and output should be in exactly the same format. The only

difference is that the first input can now be d, b, i, u, g or a, where g indicates greedy

search and a indicates A* search. Use the heuristic we developed in Task 5 for these

informed search strategies.
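
The difference between the cost-based strategies comes down to how the fringe is ordered. The sketch below uses the standard textbook definitions (UCS orders by path cost g, greedy by the heuristic h, A* by g + h); the function name expansion_priority is hypothetical.

```python
def expansion_priority(algorithm, path_cost, heuristic):
    """Return the value used to order the fringe (lower is expanded first)."""
    if algorithm == 'u':
        return path_cost              # UCS: g(n)
    if algorithm == 'g':
        return heuristic              # greedy: h(n)
    if algorithm == 'a':
        return path_cost + heuristic  # A*: g(n) + h(n)
    raise ValueError(f"no priority defined for algorithm {algorithm!r}")
```

For a node reached after 2 swaps with a heuristic value of 3, greedy search would use priority 3 and A* would use priority 5.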

Once you are finished, try running your greedy and A* searches with the following

inputs to decrypt the secret message:

>>> task6('g','secret_msg.txt','common_words.txt',90,'AENOST','n')

>>> task6('a','secret_msg.txt','common_words.txt',90,'AENOST','n')

