逻辑回归模型——python代码实现

系统 1525 0
  • 逻辑回归的推导过程:https://blog.csdn.net/ACM_hades/article/details/90448785
  • 代码主要实现了下面公式: W k + 1 = W k + λ X ( Y − f W k ( X T ) ) W^{k+1}=W^k+λX(Y-f_{W^k } (X^T)) W k + 1 = W k + λ X ( Y f W k ( X T ) )
  • 数据集 :我们选择MNIST数据集进行实验,它包含各种手写数字(0-9)图片,图片大小28*28。MNIST数据集本身有10个类别,为了将其变成二分类问题我们进行如下处理:label等于0的继续等于0,label大于0改为1。这样就将十分类的数据改为二分类的数据。
  • 特征选择 :可选择的特征有很多,包括:
    • 自己提取特征
    • 将整个图片作为特征向量
    • HOG特征
  • 我们将整个图片作为特征(784=28×28)。
            
              
                import
              
               time

              
                import
              
               numpy 
              
                as
              
               np

              
                import
              
               pandas 
              
                as
              
               pd

              
                from
              
               sklearn
              
                .
              
              model_selection 
              
                import
              
               train_test_split

              
                from
              
               sklearn
              
                .
              
              metrics 
              
                import
              
               accuracy_score



              
                class
              
              
                Logistic
              
              
                :
              
              
                def
              
              
                __init__
              
              
                (
              
              self
              
                ,
              
              feature_len
              
                )
              
              
                :
              
              
        self
              
                .
              
              weights 
              
                =
              
               np
              
                .
              
              ones
              
                (
              
              
                (
              
              feature_len
              
                ,
              
              
                1
              
              
                )
              
              
                )
              
              
                def
              
              
                model_function
              
              
                (
              
              self
              
                ,
              
               X
              
                )
              
              
                :
              
              
        W_X
              
                =
              
              np
              
                .
              
              matmul
              
                (
              
              X
              
                ,
              
               self
              
                .
              
              weights
              
                )
              
              
        temp_1
              
                =
              
              
                (
              
              W_X
              
                >=
              
              
                0
              
              
                )
              
              
                .
              
              astype
              
                (
              
              np
              
                .
              
              
                float
              
              
                )
              
              
        temp_0 
              
                =
              
              
                (
              
              W_X 
              
                <
              
              
                0
              
              
                )
              
              
                .
              
              astype
              
                (
              
              np
              
                .
              
              
                float
              
              
                )
              
              
        resut_1
              
                =
              
              
                1.0
              
              
                /
              
              
                (
              
              
                1
              
              
                +
              
               np
              
                .
              
              exp
              
                (
              
              
                -
              
              temp_1
              
                *
              
              W_X
              
                )
              
              
                )
              
              
                *
              
              temp_1
        
              
                #W_X为负数是,因为参数值inx很大时,exp(inx)可能会发生溢出,所以修改计算方式
              
              
        resut_0 
              
                =
              
               np
              
                .
              
              exp
              
                (
              
              temp_0
              
                *
              
              W_X
              
                )
              
              
                /
              
              
                (
              
              
                1
              
              
                +
              
               np
              
                .
              
              exp
              
                (
              
              temp_0
              
                *
              
              W_X
              
                )
              
              
                )
              
              
                *
              
              temp_0
        
              
                return
              
               resut_1
              
                +
              
              resut_0

    
              
                def
              
              
                train
              
              
                (
              
              self
              
                ,
              
               Data
              
                ,
              
               label
              
                )
              
              
                :
              
              
                #训练
              
              
        label 
              
                =
              
               label
              
                .
              
              reshape
              
                (
              
              
                (
              
              
                -
              
              
                1
              
              
                ,
              
              
                1
              
              
                )
              
              
                )
              
              
        alpha 
              
                =
              
              
                0.01
              
              
        max_iter 
              
                =
              
              
                500
              
              
                for
              
               i 
              
                in
              
              
                range
              
              
                (
              
              max_iter
              
                )
              
              
                :
              
              
                #迭代
              
              
            pres 
              
                =
              
               self
              
                .
              
              model_function
              
                (
              
              Data
              
                )
              
              
            error 
              
                =
              
               label
              
                -
              
              pres    
              
                #预测值和标签值所形成的误差
              
              
            self
              
                .
              
              weights 
              
                =
              
               self
              
                .
              
              weights 
              
                +
              
                alpha 
              
                *
              
               np
              
                .
              
              matmul
              
                (
              
              Data
              
                .
              
              T 
              
                ,
              
               error
              
                )
              
              
                #权重的更新
              
              
                def
              
              
                predict
              
              
                (
              
              self
              
                ,
              
               Data
              
                )
              
              
                :
              
              
                return
              
               self
              
                .
              
              model_function
              
                (
              
              Data
              
                )
              
              
                .
              
              reshape
              
                (
              
              
                -
              
              
                1
              
              
                )
              
              
                if
              
               __name__ 
              
                ==
              
              
                '__main__'
              
              
                :
              
              
                print
              
              
                (
              
              
                'Start read data'
              
              
                )
              
              
    S 
              
                =
              
               time
              
                .
              
              time
              
                (
              
              
                )
              
              
    raw_data 
              
                =
              
               pd
              
                .
              
              read_csv
              
                (
              
              
                './lihang_book_algorithm-master/data/train_binary.csv'
              
              
                )
              
              
                # 读取数据
              
              
    data 
              
                =
              
               raw_data
              
                .
              
              values  
              
                # 获取数据
              
              
                print
              
              
                (
              
              
                "data shape:"
              
              
                ,
              
               data
              
                .
              
              shape
              
                )
              
              
    imgs 
              
                =
              
               data
              
                [
              
              
                :
              
              
                ,
              
              
                1
              
              
                :
              
              
                ]
              
              
    labels 
              
                =
              
               data
              
                [
              
              
                :
              
              
                ,
              
              
                0
              
              
                ]
              
              
                print
              
              
                (
              
              
                "imgs shape:"
              
              
                ,
              
               imgs
              
                .
              
              shape
              
                )
              
              
    imgs 
              
                =
              
               np
              
                .
              
              concatenate
              
                (
              
              
                (
              
              imgs
              
                ,
              
               np
              
                .
              
              ones
              
                (
              
              
                (
              
              imgs
              
                .
              
              shape
              
                [
              
              
                0
              
              
                ]
              
              
                ,
              
              
                1
              
              
                )
              
              
                )
              
              
                )
              
              
                ,
              
               axis
              
                =
              
              
                1
              
              
                )
              
              
                #拼接常数项
              
              
                print
              
              
                (
              
              
                "imgs shape:"
              
              
                ,
              
               imgs
              
                .
              
              shape
              
                )
              
              
                print
              
              
                (
              
              
                "labels shape:"
              
              
                ,
              
               labels
              
                .
              
              shape
              
                )
              
              
                print
              
              
                (
              
              
                "label:"
              
              
                ,
              
              
                list
              
              
                (
              
              
                set
              
              
                (
              
              labels
              
                )
              
              
                )
              
              
                )
              
              
    Model 
              
                =
              
               Logistic
              
                (
              
              imgs
              
                .
              
              shape
              
                [
              
              
                -
              
              
                1
              
              
                ]
              
              
                )
              
              
                # 选取 2/3 数据作为训练集, 1/3 数据作为测试集
              
              
    train_features
              
                ,
              
               test_features
              
                ,
              
               train_labels
              
                ,
              
               test_labels 
              
                =
              
               train_test_split
              
                (
              
              
        imgs
              
                ,
              
               labels
              
                ,
              
               test_size
              
                =
              
              
                0.33
              
              
                ,
              
               random_state
              
                =
              
              
                23323
              
              
                )
              
              
                print
              
              
                (
              
              
                "train data count :%d"
              
              
                %
              
              
                len
              
              
                (
              
              train_labels
              
                )
              
              
                )
              
              
                print
              
              
                (
              
              
                "test data count :%d"
              
              
                %
              
              
                len
              
              
                (
              
              test_labels
              
                )
              
              
                )
              
              
                print
              
              
                (
              
              
                'read data cost '
              
              
                ,
              
               time
              
                .
              
              time
              
                (
              
              
                )
              
              
                -
              
               S
              
                ,
              
              
                ' second'
              
              
                )
              
              
                print
              
              
                (
              
              
                'Start training'
              
              
                )
              
              
    S 
              
                =
              
               time
              
                .
              
              time
              
                (
              
              
                )
              
              
    Model
              
                .
              
              train
              
                (
              
              train_features
              
                ,
              
               train_labels
              
                )
              
              
                print
              
              
                (
              
              
                'training cost '
              
              
                ,
              
               time
              
                .
              
              time
              
                (
              
              
                )
              
              
                -
              
               S
              
                ,
              
              
                ' second'
              
              
                )
              
              
                print
              
              
                (
              
              
                'Start predicting'
              
              
                )
              
              
    S 
              
                =
              
               time
              
                .
              
              time
              
                (
              
              
                )
              
              
    test_predict 
              
                =
              
               Model
              
                .
              
              predict
              
                (
              
              test_features
              
                )
              
              
                print
              
              
                (
              
              
                'predicting cost '
              
              
                ,
              
               time
              
                .
              
              time
              
                (
              
              
                )
              
              
                -
              
               S
              
                ,
              
              
                ' second'
              
              
                )
              
              

    score 
              
                =
              
               accuracy_score
              
                (
              
              test_labels
              
                ,
              
               test_predict
              
                )
              
              
                print
              
              
                (
              
              
                "The accruacy socre is "
              
              
                ,
              
               score
              
                )
              
            
          
            
              结果:
	Start read data
	data shape: (42000, 785)
	imgs_origin shape: (42000, 784)
	imgs shape: (42000, 785)
	labels shape: (42000,)
	label: [0, 1]
	train data count :28140
	test data count :13860
	read data cost  4.148890018463135  second
	Start training
	training cost  15.161401748657227  second
	Start predicting
	predicting cost  0.007978200912475586  second
	The accruacy socre is  0.9892496392496393

            
          

更多文章、技术交流、商务合作、联系博主

微信扫码或搜索:z360901061

微信扫一扫加我为好友

QQ号联系: 360901061

您的支持是博主写作最大的动力,如果您喜欢我的文章,感觉我的文章对您有帮助,请用微信扫描下面二维码支持博主2元、5元、10元、20元等您想捐的金额吧,狠狠点击下面给点支持吧,站长非常感激您!手机微信长按不能支付解决办法:请将微信支付二维码保存到相册,切换到微信,然后点击微信右上角扫一扫功能,选择支付二维码完成支付。

【本文对您有帮助就好】

您的支持是博主写作最大的动力,如果您喜欢我的文章,感觉我的文章对您有帮助,请用微信扫描上面二维码支持博主2元、5元、10元、自定义金额等您想捐的金额吧,站长会非常 感谢您的哦!!!

发表我的评论
最新评论 总共0条评论