Keshawn_lu's Blog

Become a better me.

Andrew Ng's Team NLP C3_W3_Assignment
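Task: named entity recognition (NER). French: geopolitical entity. Morocco: geographical entity. Christmas: time indicator. Everything else is not treated as a named entity.
Part 1: Data generator
Why shuffle through an index list: instead of using the running index to access positions in the sentence list directly, we use it to pick an entry from a list of indices. This lets us change the order in which the original list is traversed while leaving the original list itself unchanged.
de...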
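The index-list shuffle described in this excerpt can be sketched as follows. This is a minimal illustration, not the assignment's generator: the name `iterate_shuffled`, its parameters, and the single-pass behavior are assumptions made for the example.

```python
import random


def iterate_shuffled(sentences, batch_size, shuffle=True, seed=None):
    """Yield batches from `sentences` for one pass (hypothetical helper).

    Only the list of indices is shuffled; `sentences` itself is never reordered.
    """
    indices = list(range(len(sentences)))
    if shuffle:
        # Shuffle the indices, not the data.
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        batch_indices = indices[start:start + batch_size]
        # Look sentences up through the (possibly shuffled) index list.
        yield [sentences[i] for i in batch_indices]


sentences = ["s0", "s1", "s2", "s3", "s4"]
original = list(sentences)
batches = list(iterate_shuffled(sentences, batch_size=2, seed=0))
```

After the pass, `sentences` still equals `original`, while `batches` covers every sentence exactly once in a different order.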
Andrew Ng's Team NLP C3_W2_Assignment
Task: explore recurrent neural networks (RNNs).
Part 1: Convert every character in a line of text to its Unicode integer; the resulting sequence is called a tensor.
def line_to_tensor(line, EOS_int=1):
    """Turns a line of text into a tensor

    Args:
        line (str): A single line of text.
        EOS_int (int, optional): End-of-sentence integer...
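Since the excerpt's code is cut off, here is a minimal sketch of the idea it describes. The helper name `chars_to_ints` is hypothetical, and appending the end-of-sentence integer at the end is an assumption based on the `EOS_int` parameter, not the post's actual function body.

```python
def chars_to_ints(line, eos_int=1):
    """Map each character to its Unicode code point, then append `eos_int`.

    Hypothetical illustration; the returned "tensor" is a plain Python list.
    """
    tensor = [ord(ch) for ch in line]  # ord() gives the Unicode code point
    tensor.append(eos_int)             # assumed end-of-sentence marker
    return tensor


example = chars_to_ints("abc")
```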
Andrew Ng's Team NLP C3_W1_Assignment
Task: sentiment analysis with a deep neural network.
Part 1: Preparing the data
1.1 Split the data into training and validation sets at an 8:2 ratio
import numpy as np
all_positive_tweets, all_negative_tweets = load_tweets()
print(f"The number of positive tweets: {len(all_positive_tweets)}")
print(f"The number of negative tw...
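An 8:2 split of this kind can be sketched as below. The placeholder tweet lists stand in for the output of `load_tweets()`, and `split_train_val` is a hypothetical helper; splitting each class separately before concatenating is an assumption chosen to keep both sets balanced.

```python
def split_train_val(items, train_fraction=0.8):
    """Split a list into (train, validation) parts at `train_fraction`."""
    cut = int(len(items) * train_fraction)
    return items[:cut], items[cut:]


# Placeholder data standing in for the real tweets (assumption).
positive = [f"positive tweet {i}" for i in range(10)]
negative = [f"negative tweet {i}" for i in range(10)]

train_pos, val_pos = split_train_val(positive)
train_neg, val_neg = split_train_val(negative)

# Inputs: positives first, then negatives.
train_x = train_pos + train_neg
val_x = val_pos + val_neg

# Labels: 1 for positive, 0 for negative, in the same order as the inputs.
train_y = [1] * len(train_pos) + [0] * len(train_neg)
val_y = [1] * len(val_pos) + [0] * len(val_neg)
```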
Andrew Ng's Team NLP C2_W4_Assignment
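Task: compute word embeddings and use them for sentiment analysis.
Part 1: The Continuous Bag of Words model (CBOW)
In this model we are given the context words and try to predict the center word. For example, take the string: I am happy because I am learning. With C = 2, to predict happy we use:
C words before: [I, am]
C words after: [because, I]
The model architecture is shown below: where $\bar x$ is: The equations are as follows:
\begin{align} h &= W_1 \ X + ...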
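The context-window example above can be reproduced with a short sketch. The function name `get_windows` is an assumption for illustration; it yields a (context words, center word) pair for every position that has C words on each side.

```python
def get_windows(words, C):
    """Yield (context_words, center_word) pairs with C words on each side."""
    for i in range(C, len(words) - C):
        center = words[i]
        # C words before the center, followed by C words after it.
        context = words[i - C:i] + words[i + 1:i + C + 1]
        yield context, center


words = "I am happy because I am learning".split()
windows = list(get_windows(words, C=2))
first_context, first_center = windows[0]
```

With C = 2, the first pair has center word `happy` and context `['I', 'am', 'because', 'I']`, matching the example in the excerpt.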
Andrew Ng's Team NLP C2_W3_Assignment
Task: an autocomplete system.
Part 1: Processing the data
1.1 Split the text into one string per line
def split_to_sentences(data):
    """
    Split data by linebreak "\n"

    Args:
        data: str

    Returns:
        A list of sentences
    """
    sentences = data.split('\n')

    # Additional cle...
Andrew Ng's Team NLP C2_W2_Assignment
Task: part-of-speech (POS) tagging.
Part 1: Training
1.1 Build transition_counts, emission_counts and tag_counts
transition_counts counts how many times each tag follows another tag → (prev_tag, tag), used later to compute $P(t_i | t_{i-1})$
emission_counts counts how many times a word occurs with a given tag → (tag, word), used later to compute $P(w_i | t_i)$
tag_counts counts how many times each tag occurs
...
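The three count tables can be sketched as follows. This is a simplified illustration under stated assumptions: the input is a list of (word, tag) pairs, `build_counts` is a hypothetical name, and the start tag `"--s--"` is an assumed placeholder for "beginning of the corpus".

```python
from collections import defaultdict


def build_counts(tagged_words, start_tag="--s--"):
    """Build transition, emission and tag counts from (word, tag) pairs."""
    transition_counts = defaultdict(int)  # key: (prev_tag, tag)
    emission_counts = defaultdict(int)    # key: (tag, word)
    tag_counts = defaultdict(int)         # key: tag

    prev_tag = start_tag
    for word, tag in tagged_words:
        transition_counts[(prev_tag, tag)] += 1
        emission_counts[(tag, word)] += 1
        tag_counts[tag] += 1
        prev_tag = tag
    return transition_counts, emission_counts, tag_counts


corpus = [("the", "DT"), ("dog", "NN"), ("barks", "VBZ"),
          ("the", "DT"), ("cat", "NN")]
transition_counts, emission_counts, tag_counts = build_counts(corpus)
```

Dividing `transition_counts[(prev_tag, tag)]` by `tag_counts[prev_tag]` then gives an unsmoothed estimate of $P(t_i | t_{i-1})$.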
Andrew Ng's Team NLP C2_W1_Assignment
Task: automatic spelling correction.
Part 1: Processing the data
Read in the text, convert it to lowercase, then use a regular expression to extract the words.
def process_data(file_name):
    """
    Input:
        A file_name which is found in your current directory. You just have to read it in.
    Output:
        words: a list containing all...
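The lowercase-then-regex step can be sketched like this. To stay self-contained the example works on a string rather than a file, and both the helper name `extract_words` and the pattern `\w+` are assumptions, not necessarily what the post uses.

```python
import re


def extract_words(text):
    """Lowercase the text and return every run of word characters."""
    return re.findall(r"\w+", text.lower())


words = extract_words("The quick fox. The LAZY dog!")
vocab = set(words)  # unique words, useful for building a vocabulary
```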
Andrew Ng's Team NLP C1_W4_Assignment
Course link: Coursera | Online Courses & Credentials From Top Educators. Join for Free | Coursera
Task: naive machine translation and LSH (locality-sensitive hashing).
Part 1: Translating English to French
1.1 Build the embedding and transformation matrices
def get_matrices(en_fr, french_vecs, english_vecs):
    """ ...
Andrew Ng's Team NLP C1_W3_Assignment
Course link: Coursera | Online Courses & Credentials From Top Educators. Join for Free | Coursera
Task: learn word vectors, predict word analogies, reduce dimensionality with PCA, and compare words using cosine similarity.
Part 1: Cosine similarity
Cosine similarity is given by the following formula:
\cos (\theta)=\frac{\mathbf{A} \cdot \mathbf{B}}{\|\mathbf{A}\|\|\mathbf{B}\|}=\frac{\sum_{i=1}^{n} A_{i} B_{i}}{...
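Cosine similarity is the dot product of two vectors divided by the product of their Euclidean norms. A minimal pure-Python sketch of that definition follows; it uses plain lists instead of NumPy arrays and assumes neither vector is all zeros.

```python
import math


def cosine_similarity(a, b):
    """Return cos(theta) between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
parallel = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])
```

Orthogonal vectors score 0, vectors pointing the same way score close to 1, and opposite vectors score -1.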
Andrew Ng's Team NLP C1_W2_Assignment
Coursera | Online Courses & Credentials From Top Educators. Join for Free | Coursera
Task: sentiment analysis of tweets using naive Bayes.
Part 1: Processing the data
First preprocess the data: remove noise and tokenize the tweets.
custom_tweet = "RT @Twitter @chapagain Hello There! Have a great day. :) #good #morning http://chapagain.co...
鸣蜩十九
Always
Friend links
CSDN BiliBili