yftah89 / structural-correspondence-learning-scl

The code base for the SCL implementation used in "Neural Structural Correspondence Learning for Domain Adaptation", CoNLL 2017 and in "Pivot Based Language Modeling for Improved Neural Domain Adaptation", NAACL 2018

Topics: machine-learning, natural-language-processing, domain-adaptation, transfer-learning, sentiment-analysis

structural-correspondence-learning-scl's Introduction

Domain Adaptation with Structural Correspondence Learning.

This is the code repository used to generate the SCL results appearing in Neural Structural Correspondence Learning for Domain Adaptation and Pivot Based Language Modeling for Improved Neural Domain Adaptation.

The original paper on the SCL algorithm can be found here.

If you use this implementation in your article, please cite :)

@InProceedings{ziser-reichart:2017:CoNLL,
  author    = {Ziser, Yftah  and  Reichart, Roi},
  title     = {Neural Structural Correspondence Learning for Domain Adaptation},
  booktitle = {Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)},
  year      = {2017},  
  pages     = {400--410},	
}

Prerequisites

SCL requires the following packages:

Python >= 2.7

numpy

scipy

scikit-learn
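
With Python available, the dependencies can typically be installed with pip, for example (the repository does not pin exact versions):

pip install numpy scipy scikit-learn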

Example

You can find an explained example in run.py:

import tr
import sentiment

if __name__ == '__main__':
    domain = ["books", "kitchen", "dvd", "electronics"]

    # Build a shared representation for the source and target domains.
    # first param: the source domain
    # second param: the target domain
    # third param: number of pivots
    # fourth param: appearance threshold for pivots in the source and target domains
    tr.train(domain[0], domain[1], 500, 10)

    # Train the classifier on the source domain and test it on the target domain.
    # The results, weights, and all the metadata will appear in the source-target directory.
    # first param: the source domain
    # second param: the target domain
    # third param: number of pivots
    # fourth param: appearance threshold for pivots in the source and target domains
    # fifth param: the SVD dimension
    # sixth param: the regularization constant C for the logistic regression classifier
    sentiment.sent(domain[0], domain[1], 500, 10, 50, 0.1)
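
To run every source-target pair in the benchmark, one could simply loop over the domain list (a minimal sketch, assuming tr.train and sentiment.sent behave exactly as in the calls above):

import tr
import sentiment

domains = ["books", "kitchen", "dvd", "electronics"]
for source in domains:
    for target in domains:
        if source == target:
            continue  # skip same-domain pairs; no adaptation needed
        # Shared representation, then train on source / test on target.
        tr.train(source, target, 500, 10)
        sentiment.sent(source, target, 500, 10, 50, 0.1)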


structural-correspondence-learning-scl's Issues

index out of range error

Hi,

I wonder if the following code can sometimes be problematic: the value of i might run out of range before the while loop terminates:

while c < pivot_num:
    name = bigram_vectorizer.get_feature_names()[MIsorted[i]]
    s_count = getCounts(X_2_train_source, bigram_vectorizer_source.get_feature_names().index(name)) if name in bigram_vectorizer_source.get_feature_names() else 0
    t_count = getCounts(X_2_train_target, bigram_vectorizer_target.get_feature_names().index(name)) if name in bigram_vectorizer_target.get_feature_names() else 0
    if s_count >= pivot_min_st and t_count >= pivot_min_st:
        names.append(name)
        pivotsCounts.append(bigram_vectorizer_unlabeled.get_feature_names().index(name))
        c += 1
        # if c < 100:
        #     print "feature is ", name, " it MI is ", RMI[MIsorted[i]], " in source ", s_count, " in target ", t_count
    i += 1

Thanks,
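
One possible fix, reusing the variable names from the snippet above, is to also bound the loop on i so it stops once the candidate features are exhausted (a minimal sketch, not the repository's code):

# Cache the feature-name lists so the loop does not recompute them.
source_features = bigram_vectorizer_source.get_feature_names()
target_features = bigram_vectorizer_target.get_feature_names()
candidates = bigram_vectorizer.get_feature_names()

# Also stop when i runs past the end of MIsorted.
while c < pivot_num and i < len(MIsorted):
    name = candidates[MIsorted[i]]
    s_count = getCounts(X_2_train_source, source_features.index(name)) if name in source_features else 0
    t_count = getCounts(X_2_train_target, target_features.index(name)) if name in target_features else 0
    if s_count >= pivot_min_st and t_count >= pivot_min_st:
        names.append(name)
        pivotsCounts.append(bigram_vectorizer_unlabeled.get_feature_names().index(name))
        c += 1
    i += 1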

Unsatisfying result

First of all, thanks for the code; I find it really helpful. After running it, the results I got are not as good as those in the original paper. Could you please help explain what the reason might be? Going through the code, it all looks correct to me.

(adapting from books to kitchen, c_parm = 0.1)

dim    dev: rep / non / all        target: rep / non / all
50     0.7025 / 0.7825 / 0.79      0.666 / 0.7555 / 0.7575
100    0.7125 / 0.7825 / 0.785     0.6925 / 0.7555 / 0.7625
150    0.745 / 0.7825 / 0.795      0.7135 / 0.7555 / 0.75

Larger unlabeled dataset

Hi, it's me again...

I noticed a small inconsistency between your implementation and the paper. It seems the unlabeled data you used (from 6001 for books up to 34742 for dvd) is far more than Blitzer used (from 3685 to 5945). I am not sure whether you used all of it in your actual experiments, so I wanted to confirm this with you.

Also, I wonder if you can recall how exactly you collected the data. According to this paper, there seem to be two Amazon datasets (a big one and a small one):
http://www.icml-2011.org/papers/342_icmlpaper.pdf
Blitzer clearly used the small one in the SCL paper. In your implementation, the labeled data has the same size as Blitzer's (2000 positive and 2000 negative reviews per domain), so I just wanted to know where you got the unlabeled data from, as the small Amazon dataset does not seem to contain that much of it.

Thanks as always
